It’s that time of year when music fans, film buffs and tech geeks descend upon Austin for a magical few days of shared enthusiasm for the latest in everything from art to technology.
The Datafiniti team will be out in full force, from Thursday, March 8th to Tuesday, March 12th. You’ll be able to find us at the following events:
What: Austin HUG Meetup
Why: The Austin Big Data User Group is holding a meetup at Bazaarvoice’s offices. There will be some great talks on everything from big data platform management to scaling out data pipelines.
Where: Bazaarvoice Offices
When: Thursday, March 8th from 6:30 to 9 pm
What: Austin [Big] Data Party
Why: This annual event hosts top folks from around the data world. There will be discussions on various data technologies and general frivolity! You might even be able to test-drive Datafiniti at the party with the help of our team. If you’re interested in attending, contact us for VIP passes.
When: Sunday, March 11th from 6 to 9 pm
Finally, if you’re interested in grabbing a drink or chatting about what we’re up to, feel free to ping our team @Datafiniti!
To collect massive amounts of data, Datafiniti relies on a unique and powerful web crawling back-end called 80legs, which we built in-house (and which is available as a stand-alone service!).
80legs is built on grid computing. Volunteer computers all over the world opt in to participate in a connected grid. Thanks to this architecture, 80legs, and in turn Datafiniti, gets access to very cheap bandwidth and compute power, which allows us to easily scale our data collection effort.
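For the curious, here is a toy sketch of the general idea behind a volunteer grid: computers opt in, then pull small batches of work from a shared queue. This is only an illustration under our own assumptions, not how 80legs is actually implemented. The `ToyGrid` class, its method names, the batch size and the example URLs are all hypothetical.

```python
from collections import deque


class ToyGrid:
    """Toy coordinator: opted-in volunteer nodes pull batches of URLs to crawl.

    Hypothetical illustration only; not the actual 80legs design or API.
    """

    def __init__(self, urls, batch_size=2):
        self.pending = deque(urls)    # URLs not yet handed out
        self.batch_size = batch_size  # assumed batch size, chosen for the demo
        self.nodes = set()            # volunteer nodes that have opted in
        self.assigned = {}            # node id -> URLs handed to that node

    def opt_in(self, node_id):
        """Register a volunteer computer with the grid."""
        self.nodes.add(node_id)

    def request_work(self, node_id):
        """Hand the next batch of pending URLs to an opted-in node."""
        if node_id not in self.nodes:
            raise ValueError(f"{node_id} has not opted in")
        batch = []
        while self.pending and len(batch) < self.batch_size:
            batch.append(self.pending.popleft())
        self.assigned.setdefault(node_id, []).extend(batch)
        return batch


# Example URLs and node names are placeholders.
grid = ToyGrid([
    "http://example.com/a",
    "http://example.com/b",
    "http://example.com/c",
])
grid.opt_in("volunteer-1")
grid.opt_in("volunteer-2")
first = grid.request_work("volunteer-1")
second = grid.request_work("volunteer-2")
print(first, second)
```

Because each node asks for work when it has spare capacity, adding more volunteers in a setup like this adds crawling capacity without any change to the coordinator.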
Today we are excited to announce a partnership with a major new provider of volunteer computers: Charity Engine. Charity Engine has a unique model in which part of its proceeds (paid by us) goes to non-profit charities tasked with saving the world. In essence, folks who donate their idle CPU time to Charity Engine are converting spare CPU cycles into world-saving charitable donations. It’s really quite exciting!
If you’d like to learn more about Charity Engine, we encourage you to check out their website at http://www.charityengine.com and sign up to contribute your computer.
It’s hard to believe it’s only been 2 weeks since our launch! We’ve received a lot of substantive feedback from all sorts of people and have been working hard to rapidly improve Datafiniti based on what we’ve been hearing.
Here are a few updates we’ve made since launch:
- Keyword search is now available as the default search option. You can still use the dfQL-based advanced search by clicking on “Switch to Advanced Search”. Keyword searches give you less control over the data set returned, but are much easier to use.
- A more interactive walkthrough for advanced dfQL search. The walkthrough provides better feedback when you enter something incorrectly.
Here are some things in the development pipeline:
- Improving response time. Right now search response time is too inconsistent. Sometimes data sets are generated in 1-2 seconds and sometimes it takes 2 minutes or longer. We’re exploring and testing several different methods of bringing down the response time to ~1 second on average.
- Adding more exploration of data rows. We’ll be adding a link on each row to see all available information for that entity so you can learn more about what data is available in Datafiniti.
- Easier API. We’ll be reducing the number of steps required to work with the API. We’ll try to get it down to a 1-step process.
- More data! We’re aggressively adding more data to fill out our business and people data right now.
Stay tuned! We’re just getting started!
The rest of the Datafiniti team and I are excited to announce the launch of the first search engine for data. This search engine will make it possible for you to tap into the vast store of knowledge and information kept throughout the web.
Unlike other search engines, which provide simple keyword-based queries and are only intended to return single results, Datafiniti lets you enter more structured queries that will generate full data sets taken from the web.
Take a look at the video below to get a sense of how this works:
Currently, Datafiniti lets you search for data on locations (or businesses), people (and social data), products, property (or real estate) and news. It is our commitment to provide the most extensive and robust data around these data types and many others as we move forward.
Although the business and technical challenges of providing such a tool to you greatly excite us, there is a more fundamental belief that has driven the creation of Datafiniti. We believe that Datafiniti represents a significant step forward in the development of a data-driven mindset. We believe that data is the building block of knowledge, and it is our hope that you are able to find knowledge through the data we provide.
Ah, the inaugural post. A bright horizon of uncharted possibilities awaits the blog, just as it does the world of data!
A more in-depth explanation of what Datafiniti is will be coming soon, but for now, here’s a brief intro to what we’ll be covering here on the blog.
We have a profound belief in the value of data and how it can be used to improve all aspects of our lives. Many of our posts will cover how data is being used in innovative and impactful ways. This will include customer success stories, but it will also include general use cases we hear about and find fascinating.
Our engineering team will be contributing posts on interesting, and sometimes maddening, challenges they face in building a search engine for data. Expect to read about scaling Solr, Cassandra and other big data technologies.
Of course, there will also be regular product announcements as we import new data types, release query features and so on.
That’s all for the inaugural post! Be sure to subscribe to the RSS feed to stay up to date!