I'm not really sure what to call this, but we have to do some data analysis using Hadoop, and in order to be efficient, we apply multiple classifiers at once to the dataset per job.
There might be something for this already in the builtin libs, but I wanted to apply a map of functions (a variable number of them) to a bunch of data and get back a map from each function's key to that function's result on the data.
Here's the src...
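A minimal sketch of the idea, assuming the functions live in a map keyed by name. `apply-fns` and the sample `classifiers` are hypothetical names made up for illustration, and the toy functions only stand in for real classifiers:

```clojure
;; Hypothetical helper: apply every function in fn-map to the same
;; arguments and return a map from each function's key to its result.
(defn apply-fns
  [fn-map & args]
  (into {}
        (map (fn [[k f]] [k (apply f args)]) fn-map)))

;; Stand-in "classifiers", for illustration only.
(def classifiers
  {:total  (fn [xs] (reduce + xs))
   :n      count
   :large? (fn [xs] (boolean (some #(> % 100) xs)))})

(apply-fns classifiers [1 2 3])
;; => {:total 6, :n 3, :large? false}
```

Because the functions sit in an ordinary map, adding or removing a classifier for a job is just a matter of changing the map.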
Tuesday, July 7, 2009
Thursday, July 2, 2009
My Emacs setup (for clojure, ecb, etc)
To check out the src, go to github.
So I've been working recently in Clojure with a colleague of mine, and being a Vim user, it's been kind of daunting initially to get started with emacs.
I was looking for a standard install that comes with sane default plugins already installed. I saw ClojureBox, but I didn't want it to be such a black box in case I wanted to add some of my own customizations.
It's a good thing I found Jonas Boner's emacs config on github.
It was a pretty modular setup, and came with some stuff that I wanted like ECB (code browsing).
The third-party plugins are all in a folder, with the startup and loading all in a different one, and you can put an emacs lisp file for whatever you want to happen during emacs startup.
(mapcar 'load-directory
'("$EMACS_LIB/startup"))
I just sanitized and generalized a lot of it, made the clojure stuff work better, and grabbed some stuff from Brad Cross's clojure emacs config, like clojure-pom, which has slime-project. (It starts with clojure-contrib.) It's really easy to get started: go ahead and take a look at the README to see what I mean. (Just 2 steps!)
Some startup keys:
M-x ecb-activate
M-x slime-project (then enter path to project)
C-c C-, (run test-is for clojure on current file)
Wednesday, July 1, 2009
If You Ever Need to Bootstrap Clojure in a non-Clojure environment
Read this thread. http://bit.ly/HocNx
I'm using clojure as a wrapper over cascading, and reading this has been really helpful in finding out how clojure works under the hood.
Friday, June 26, 2009
Art of War for Business....
Turn their strengths against them.
Or find weaknesses in their strengths, or core competencies.
Sunday, June 21, 2009
Clojure to Java interop gotchas so far
1.) When your package name has a dash (-), the matching file or directory name on the filesystem should use an underscore (_) instead.
2.) When calling vararg Java methods or constructors, e.g.
public transient Fields(java.lang.Comparable... comparables) { /* compiled code */ }
you have to pass the arguments as an array, using into-array in clojure:
(Fields. (into-array Comparable ["1" "2" "3"]))
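Both gotchas can be tried at a REPL without any extra libraries. This is only a sketch: `my-app.core` is a made-up namespace name, and the JDK's `String.format(String, Object...)` stands in for the `Fields` constructor as the varargs target.

```clojure
;; Gotcha 1: dashes in a namespace name map to underscores on disk.
;; munge is the clojure.core function that performs this character mapping,
;; so a namespace called my-app.core would be loaded from my_app/core.clj.
(def ns-file-stem (munge "my-app.core"))

;; Gotcha 2: a Java varargs parameter is just an array from Clojure's side,
;; so the trailing arguments have to be packed with into-array.
(def formatted
  (String/format "%s-%s-%s" (into-array Object ["1" "2" "3"])))
```

Note that `into-array` takes the element type as its first argument, which matters when the Java method expects something more specific than `Object`, as with `Comparable` above.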
These two things bit me recently, and they took some time to resolve.
Please post more in the comments or in your own blogs if you find any gotchas!
Good words to work by.
So I have just one wish for you — the good luck to be somewhere where you are free to maintain the kind of integrity I have described, and where you do not feel forced by a need to maintain your position in the organization, or financial support, or so on, to lose your integrity. May you have that freedom.
-Richard Feynman
Saturday, June 20, 2009
Lazy Windowed View into collection for clojure
There's a frequent need in time-series/financial applications and statistical analysis to do a windowed view into a collection of values.
Say you want to compute the mean of every 4 overlapping values of a huge collection/stream. It would look like this:
stream: [1, 2, 3, 4, 5, 6, 7, 8.....n]
result: [(mean [1 2 3 4]) (mean [2 3 4 5]) .... (mean [n-3 n-2 n-1 n])]
You can do this in clojure really easily, and in a lazy and memoized way as well, which prevents a stack overflow in case the source stream is too big.
Below is the code:
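As a minimal sketch of one way to get this behaviour, the built-in `partition` with a step of 1 already yields overlapping windows lazily, and a lazy seq caches each element once it has been realized. `windowed` and `mean` are hypothetical names chosen here for illustration, not necessarily what the original source used:

```clojure
;; Arithmetic mean of a non-empty collection of numbers.
(defn mean [xs]
  (/ (reduce + xs) (count xs)))

;; Lazily applies f to every overlapping window of n items in coll.
;; (partition n 1 coll) slides the window forward one item at a time
;; and drops any trailing window that has fewer than n items.
(defn windowed
  [f n coll]
  (map f (partition n 1 coll)))

;; Works on an infinite stream because nothing is computed until asked for.
(take 3 (windowed mean 4 (iterate inc 1)))
;; => (5/2 7/2 9/2)
```

Since both `map` and `partition` return lazy sequences, only the windows that are actually consumed get computed.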
