Wednesday, February 26, 2014

London developers contribute community detection code for GraphLab

More good news: we just received another code contribution, this time from Michael Leznik and Ryan Toppings. They have implemented a community detection algorithm based on label propagation. The LPA algorithm is explained in:

Advanced modularity-specialized label propagation algorithm for detecting communities in networks. X. Liu, T. Murata Tokyo Institute of Technology, 2-12-1 Ookayama, Meguro, Tokyo 152-8552, Japan
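To give a feel for what label propagation does, here is a minimal Python sketch of the plain algorithm. It is not the modularity-specialized variant from the paper and it is not the contributed GraphChi code. The function name, the toy graph, and the deterministic tie-break (largest label wins) are assumptions made for illustration; the usual formulation breaks ties at random.

```python
from collections import Counter


def label_propagation(edges, max_rounds=100):
    """Plain label propagation on an undirected edge list (illustrative helper)."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    # Every vertex starts in its own community.
    labels = {v: v for v in neighbors}

    for _ in range(max_rounds):
        changed = False
        for v in sorted(neighbors):
            counts = Counter(labels[n] for n in neighbors[v])
            best = max(counts.values())
            # Assumption: ties go to the largest label so the run is reproducible.
            new_label = max(l for l, c in counts.items() if c == best)
            if new_label != labels[v]:
                labels[v] = new_label
                changed = True
        if not changed:
            break
    return labels


# Toy graph: two triangles joined by the single edge (2, 3).
toy_edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
communities = {}
for vertex, label in label_propagation(toy_edges).items():
    communities.setdefault(label, []).append(vertex)
print(communities)
```

Each vertex repeatedly adopts the label that is most common among its neighbors until nothing changes, so densely connected groups end up sharing one label.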

The contribution contains Java code for the Java version of GraphChi. (C++ code for the C++ version of GraphChi is also available.)

The required Java files are:

The code contains a toy example, which can be run by passing the following parameters to the CommunityDetection class, which implements GraphChiProgram<Integer, Integer>:
graph_test_data.tsv 2 edgelist

The first parameter is the graph file name.
The second parameter is the number of shards to use.
The third parameter is the type of the input graph.
Don't forget to link GraphChi to the project.

In this case the output file is graph_test_data.tsv.communities_ids, which contains the label ids and the ids belonging to each label. The output for the toy example is included in the zip file.

Thanks a lot, Michael & Ryan. Your contributions are highly appreciated!

Kobo contributes open source LDA inference code for GraphLab

More good news to report! We just heard from our collaborators at Kobo, Curtis Ge and Darius Braziunas, that their code contribution has been cleared. Kobo has released Latent Dirichlet Allocation (LDA) code that performs inference based on a GraphLab LDA model.

In more detail, GraphLab LDA trains the model and classifies the existing documents into clusters, but when new documents arrive there is no way to assign them to the learned classes. Kobo's code classifies new documents based on the trained GraphLab LDA model.
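As a rough illustration of what inference on an unseen document involves, here is a minimal Python sketch of an EM-style "fold-in": the trained topic-word probabilities stay fixed and only the topic mixture of the new document is estimated. This is a generic textbook-style procedure, not Kobo's implementation. The function name and the toy topic-word table are assumptions for the example.

```python
def infer_topic_mixture(doc_word_ids, topic_word_probs, iterations=50):
    """Estimate the topic mixture of one unseen document.

    topic_word_probs[k][w] is the (already trained, fixed) probability of
    word w under topic k. doc_word_ids is the document as a list of word ids.
    """
    num_topics = len(topic_word_probs)
    theta = [1.0 / num_topics] * num_topics  # start from a uniform mixture

    for _ in range(iterations):
        totals = [0.0] * num_topics
        for w in doc_word_ids:
            # Responsibility of each topic for this word occurrence.
            weights = [theta[k] * topic_word_probs[k][w] for k in range(num_topics)]
            norm = sum(weights)
            if norm == 0.0:
                continue  # word has zero probability under every topic
            for k in range(num_topics):
                totals[k] += weights[k] / norm
        total = sum(totals)
        if total == 0.0:
            break
        theta = [t / total for t in totals]
    return theta


# Hypothetical trained model: 2 topics over a 4-word vocabulary.
toy_topics = [
    [0.4, 0.4, 0.1, 0.1],  # topic 0 prefers words 0 and 1
    [0.1, 0.1, 0.4, 0.4],  # topic 1 prefers words 2 and 3
]
new_document = [0, 1, 0, 2]
mixture = infer_topic_mixture(new_document, toy_topics)
print(mixture, "-> assigned topic", mixture.index(max(mixture)))
```

The new document is then assigned to the topic with the largest weight in its mixture, without retraining the model.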

Anyone who wants to test Kobo's code is welcome to try it out. We will soon merge this code into the main GraphLab branch.

Interesting Twitter Graph Analysis & Visualization

I got this from my colleague Zach Nation: an interesting Twitter graph visualization.

Sunday, February 23, 2014

O'Reilly Blog Post on GraphLab SFrames

Fresh out of the oven: Ben Lorica from O'Reilly just published a blog post that highlights our new functionality, SFrames.
The post also appears at Forbes.

Wednesday, February 19, 2014

Drug repurposing using GraphLab

Recently I learned about interesting work by Murat Can Cobanoglu, a graduate student at the CMU-Pitt computational biology institution, who is using GraphLab to predict drug-target interactions.
The basic idea is that creating a new medication is a complicated and costly process. Instead, drugs that are already known to be safe can be reused for purposes other than the ones they were designed for. For example, Viagra was designed as a heart medication and eventually ended up as a stimulator.

The work is reported in the following paper: 
The construction is rather simple. A matrix of drugs vs. their gene interactions is constructed, and a matrix factorization method is used to build a model. Using the model, one can predict the interaction of any drug with any gene.
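The construction can be sketched in a few lines of Python. This is a minimal illustration using plain stochastic gradient descent on a tiny made-up interaction table; it is not the method or the data from the paper. The function names, the hyperparameters, and the toy matrix are assumptions.

```python
import random


def factorize(observed, num_drugs, num_genes, rank=2, epochs=500,
              lr=0.05, reg=0.01, seed=0):
    """Fit drug and gene factor vectors to the observed matrix entries with SGD."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.0, 0.5) for _ in range(rank)] for _ in range(num_drugs)]
    Q = [[rng.uniform(0.0, 0.5) for _ in range(rank)] for _ in range(num_genes)]
    for _ in range(epochs):
        for (i, j), value in observed.items():
            pred = sum(P[i][f] * Q[j][f] for f in range(rank))
            err = value - pred
            for f in range(rank):
                p, q = P[i][f], Q[j][f]
                # Gradient step on squared error with L2 regularization.
                P[i][f] += lr * (err * q - reg * p)
                Q[j][f] += lr * (err * p - reg * q)
    return P, Q


def predict(P, Q, drug, gene):
    """Predicted interaction score: dot product of the two factor vectors."""
    return sum(p * q for p, q in zip(P[drug], Q[gene]))


# Made-up data: (drug, gene) -> 1.0 if an interaction is known, 0.0 if not.
# The pair (2, 1) is left out on purpose and is scored by the model below.
observed = {
    (0, 0): 1.0, (0, 1): 1.0, (0, 2): 0.0,
    (1, 0): 1.0, (1, 1): 1.0, (1, 2): 0.0,
    (2, 0): 0.0, (2, 2): 1.0,
}
P, Q = factorize(observed, num_drugs=3, num_genes=3)
print("score for unobserved pair (2, 1):", predict(P, Q, 2, 1))
```

Once the factors are learned, any drug-gene pair can be scored, including pairs that were never observed in the matrix.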

If you would like to learn more about this work and about GraphLab applications in health care, you are invited to attend our 3rd GraphLab Conference. We have invited Murat to present his work there.

Spotlight: Treato - big data analytics for patient reaction to medication

I stumbled upon this interesting Israeli startup, which analyzes patient reactions to medications. They have a large dataset of crawled patient comments about the drugs they are using, and they apply some fancy analytics to it. For example, you can search for a specific medication like Tylenol.

Quickly get an overview of it:
Learn how users rank this medication:
Additionally, you can learn about people's concerns:

Very cool application utilizing very big data!

Tuesday, February 18, 2014

DARPA* project contributes graphical models toolkit to GraphLab

We are proud to announce that, following many months of hard work, Scott Richardson from Vision Systems Inc. has contributed a graphical models toolkit to GraphLab. Here is some information about their project:

Last year Vision Systems, Inc. (VSI) partnered with Systems & Technology Research (STR) and started working on a DARPA* project to develop intelligent, automatic, and robust computer vision technologies based on realistic conditions. Our goal is to develop a software system that lets users ask queries of photo content, such as "Does this person look familiar?" or "Where is this building located?" If successful, our technology would alert people to scenes that warrant their attention.

We had an immediate need for a solid, scalable graph-parallel computation engine to replace our internal belief propagation implementation. We quickly gravitated to GraphLab. Using this framework, we designed the Factor Graph toolkit based on Joseph Gonzalez's initial implementation. A factor graph, a type of graphical model, is a bipartite graph composed of two types of vertices: variable nodes and factor nodes. The Factor Graph toolkit is able to translate a factor graph into a GraphLab distributed graph and perform inference using a vertex program that implements the well-known message-passing algorithm belief propagation. Both belief propagation and factor graphs are general tools that have applications in a variety of domains.
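For readers new to factor graphs, here is a minimal single-machine Python sketch of sum-product belief propagation on a tiny factor graph. It only illustrates the message-passing idea and is not the Factor Graph toolkit or its API. The function names, the data layout, and the example factors are assumptions.

```python
from itertools import product


def normalize(values):
    total = sum(values)
    return [x / total for x in values]


def belief_propagation(variables, factors, iterations=10):
    """Sum-product belief propagation on a discrete factor graph.

    variables: dict mapping variable name -> number of states.
    factors: list of (scope, table); scope is a tuple of variable names and
             table maps each assignment tuple (in scope order) to a weight.
    Returns a dict mapping variable name -> normalized marginal.
    """
    edges = [(f, v) for f, (scope, _) in enumerate(factors) for v in scope]
    # All messages start uniform.
    var_to_fac = {(v, f): [1.0] * variables[v] for f, v in edges}
    fac_to_var = {(f, v): [1.0] * variables[v] for f, v in edges}

    for _ in range(iterations):
        # Factor -> variable: weight each assignment by the incoming messages
        # from the other variables, then sum those variables out.
        updated = {}
        for f, (scope, table) in enumerate(factors):
            for v in scope:
                msg = [0.0] * variables[v]
                for assignment in product(*(range(variables[u]) for u in scope)):
                    weight = table[assignment]
                    for u, state in zip(scope, assignment):
                        if u != v:
                            weight *= var_to_fac[(u, f)][state]
                    msg[assignment[scope.index(v)]] += weight
                updated[(f, v)] = normalize(msg)
        fac_to_var = updated

        # Variable -> factor: product of messages from all other factors.
        for (v, f) in var_to_fac:
            msg = [1.0] * variables[v]
            for (g, u), incoming in fac_to_var.items():
                if u == v and g != f:
                    msg = [m * x for m, x in zip(msg, incoming)]
            var_to_fac[(v, f)] = normalize(msg)

    # Belief of a variable: product of all incoming factor messages.
    marginals = {}
    for v, size in variables.items():
        belief = [1.0] * size
        for (g, u), incoming in fac_to_var.items():
            if u == v:
                belief = [b * x for b, x in zip(belief, incoming)]
        marginals[v] = normalize(belief)
    return marginals


# Two binary variables: a prior factor on A and a pairwise factor on (A, B).
example_variables = {"A": 2, "B": 2}
example_factors = [
    (("A",), {(0,): 0.6, (1,): 0.4}),
    (("A", "B"), {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.8}),
]
print(belief_propagation(example_variables, example_factors))
```

On tree-structured graphs like this one the computed beliefs equal the exact marginals; on graphs with cycles the same updates give an approximation.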

We are very excited to get to work on key problems in the Machine Learning/Machine Vision field and to be a part of the powerful communities, like GraphLab, that make it possible.

Below the fold, more about STR and VSI: 
Systems & Technology Research (STR) is a new company focused on developing innovative solutions for challenging information processing problems.  STR staff members bring a deep understanding of state-of-the-art computer vision, signal processing and computer science technologies and the skills to transition them into well-engineered user-oriented products.

Vision Systems, Inc. (VSI) is a start-up Research and Development firm located in Providence, RI. VSI performs research and development in the fields of computer vision and artificial intelligence / machine learning with customers in the intelligence, defense, and Geographic Information Systems (GIS) communities. 

*This material is based upon work supported by DARPA and the United States Air Force Research Laboratory under Contract No. FA8750-12-C-0102. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the United States Air Force.


Saturday, February 1, 2014

Dendrite: Combining Titan+GraphLab into a powerful suite

We just learned about a new open source project from Lab41 called Dendrite.
Dendrite is a suite for data analytics in which data storage is implemented using the Titan distributed graph database and the analytics layer is implemented with GraphLab. On top of that, some cool visualizations were added to provide a point-and-click interface. Here are some screenshots: