For a bigger project, which I hope to blog about soon, I needed to get the OpenCV Java Native Library (JNI) running in a Flink stream. It was a pain in the ass, so I’m putting this here to help the next person. First thing I tried… Just doing it. For OpenCV, you need to […] Read more "Using JNIs (like OpenCV) in Flink"
We’re finishing up our series of blogs on providers with YouTube. After this we’ll get down to business on building our own social monitoring quasi-app! Getting YouTube Credentials: Before accessing any API, we of course need to get our credentials. A good first step in an enterprise like this would be to read the docs, or […] Read more "Watching YouTube Activity with Apache Streams"
This week we’re going to really show off how easy it is to “roll our own” algorithms in Apache Mahout by looking at Eigenfaces. This algorithm is really easy and fun in Mahout because Mahout comes with a first-class distributed stochastic singular value decomposition (Mahout SSVD). This is going to be a big job, […] Read more "Deep Magic, Volume 3: Eigenfaces"
In our last adventure we did a basic example of Apache Streams with Twitter data. This week we’re going to extend that example with Facebook data! Also note, if this seems a little light, it is because it’s not that different from the last post, and the full explanations are there. Our goal here […] Read more "Getting to Know Your Friends with Apache Streams"
Apache Streams is a utility for easily interacting with an ever-growing galaxy of social media APIs, collecting data into a common format, and persisting to file or DB.
This post is the first of many to explore this exciting project. Read more "Dipping Your Toes in Apache Streams"
In this post we’re going to really show off the coolest (imho) use case of Apache Mahout: roll your own distributed algorithms. All of these posts are meant for you to follow along at home, and it is entirely possible you don’t have access to a large YARN cluster. That’s OK. Short story- they’re free on […] Read more "Deep Magic Volume2: Absurdly Large OLS with Apache Mahout"
Big Data for n00bs is a series I am working on of the absolute simplest working examples for people just getting started in Big Data. In this post we explore ‘Gelly’, the graph processing library of Apache Flink. Read more "Big Data for n00bs: Gelly on Apache Flink"