Machine Learning and Bacterial Trajectories
Big Data and analytics are changing the way we see things. They are even changing how we see some of the smallest organisms – bacteria. Bacteria are really simple creatures. Most of them are like tiny rods...
Analyzing Massive Datasets
As a data scientist at Skytree, I believe that I’m on the cutting edge of Machine Learning and Big Data. And being on the cutting edge means that we do a lot of research and analysis. As a result of...
Automatic Method and Parameter Selection
In the last few months, I’ve found that I can run a lot more experiments and build more accurate models while actually investing less time in parameter tuning, using the new techniques of...
10 Billion Row Challenge
As data size increases, accuracy increases. One of our mottos here at Skytree is “Bigger Data – Better Results”. This is discussed in detail in a whitepaper that can be found here, but for this post we...
SFO Customer Satisfaction Survey
Each year, San Francisco Airport (SFO) conducts a customer satisfaction survey to find out what they are doing well and where they can improve. While there is a lot of information to potentially...
Astronomical Data
There are over 100,000,000,000 stars in our Milky Way Galaxy, and over 100,000,000,000 galaxies in the universe. At 10^22 stars, that’s more stars than there are grains of sand on all the beaches in the...
Featurizing Data: Spark and Beyond
Leverage the Data Transformation Capabilities in Spark with Machine Learning. Challenges Facing Data Engineers and Data Scientists: Machine learning as a technology can be challenging. It is difficult to...
Learning with Similarity Search
Similarity Search: Given a set of objects R, a query q and a notion of similarity between the query and the objects, the task of similarity search (or nearest-neighbour search) is to find the object in...
Skytree 15.2 – Better, More Predictive Models
There is little debate that the more data used to train your predictive model, the better the predictive accuracy will be. We generally associate ‘more data’ with a higher volume of the same type of...
When Most of Your Data is Unstructured
Unstructured text makes up about 80% of the useful business data in many enterprises. Policies, patents, whitepapers, legal documents, marketing materials, memos, notes, email, chat, news and reports...