We’ve made some big updates to the Datafiniti website! You should go check it out now if you haven’t already. Changes include:
- Focus on business and product data. We’ve decided to focus on daily updates to our business and product data. We’ll circle back with more information about what we’re doing here, but here’s a quick peek: we’re going to provide daily updates across hundreds of websites for business and product listings. Our website has been updated to reflect this focus.
- Removed “People” data search. People data is still technically available in our database, but since we are seeing the most interest in our business and product data, we’ve decided to focus on just those two data types for now. People data will likely make a return sometime in the future.
- More content around use-cases. Check out how Datafiniti can be used for monitoring business reviews, monitoring product reviews, and monitoring product prices.
We’re also chunking through some much-needed data fixes. Expect to see several attributes around businesses fixed throughout the month.
On May 18, 2013, we announced that we were moving Datafiniti to Austin. A big reason for our move was the lack of a suitable talent pool from which to hire. As discussed on Xconomy, there was a lot of skepticism (perhaps surprisingly so) around our reasoning. We provided a healthy dose of data to back up our belief that the move to Austin would help us out. Now, after 8 months, I can provide a full breakdown of how our move to Austin benefited our hiring process.
By the numbers
Above you can see a conversion funnel for our hiring process. Once a candidate shows interest in Datafiniti, we have 5 steps to recruit and screen them:
- Intro call: A 30-45 minute call to tell the candidate a bit about our team and what we do, as well as to learn a few basic things about the person’s experience and personality.
- Fizz buzz: A small quiz that tests basic programming concepts.
- Coding challenge: A 1-2 day programming challenge that lets us evaluate someone’s coding style and knack for algorithms and data structures.
- Interview: An in-person, 1/2-day interview that serves as a deep dive into someone’s technical abilities and cultural fit. We also give the candidate plenty of opportunity to learn as much as possible about Datafiniti. Interviews are two-way streets.
- Hiring: If a candidate makes it this far, we’ve extended them an offer. “Failure” here means they did not accept the offer.
Between May and December, we converted 39 interested candidates into 3 hires. All of this was done without paying for any recruiting services, including job listings.
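For a sense of scale, the overall conversion from those two figures can be worked out in a few lines. This is only a sketch using the 39 and 3 quoted above; the helper name `conversion_rate` is ours for illustration, and no per-step counts are assumed.

```python
def conversion_rate(entered: int, passed: int) -> float:
    """Return the fraction of candidates who made it through."""
    if entered <= 0:
        raise ValueError("entered must be a positive number of candidates")
    return passed / entered


# Figures quoted in the post: 39 interested candidates, 3 hires.
interested = 39
hired = 3

overall = conversion_rate(interested, hired)
candidates_per_hire = interested / hired

print(f"Overall conversion: {overall:.1%}")               # 7.7%
print(f"Candidates per hire: {candidates_per_hire:.0f}")  # 13
```

In other words, roughly one in thirteen interested candidates ended up joining the team.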
Differences between Houston & Austin
A larger, more competitive recruiting environment
This is intuitive. There are more developers in Austin, but also more companies trying to hire the same people as us. Because of this, we found more people that matched the listed skill sets for our job openings, but we saw greater drop-offs through the hiring process as compared to what we experienced in Houston.
Our recruiting niche
In Houston, we separated ourselves from other companies hiring developers by pitching ourselves as offering a unique (for Houston) tech startup environment. That was obviously not unique in Austin, but we found that we still offered a unique, albeit different, environment here. Simply put: we offer people the chance to work with a small team that works on big problems. We’re still around 10 people, but we process billions of data points every day. We also offer developers the opportunity to work with and be exposed to a wide variety of technologies. We use multiple programming languages, databases, and algorithms. Our developers get to touch any or all of these tools if they want. All of that is awesome and exciting to candidates.
Streamlining our hiring process
We realized that our hiring process was taking longer than we wanted. Obviously we wanted to do a sufficiently thorough job of screening people, but we felt that for certain candidates, every step wasn’t necessary. It could even be a hindrance to keeping a candidate engaged. When a candidate had a fairly active public code repo, we sometimes skipped the coding challenge. When a candidate came from a very technical background, we sometimes skipped the fizz buzz. We always made sure to test these same concepts during the in-person interview, so we never “degraded” the comprehensiveness of our screening. We just made it faster when we could.
Hiring in Austin is harder than it is in Houston. It’s also more rewarding. We learned a lot about what made us unique. We evolved as a team, and everyone learned how to be better recruiters. Most importantly, we’ve constructed a team that will help us serve our customers and grow our business better than we ever have before.
BTW, if you’re interested in finding out more about what makes working with us so awesome, check out our latest job postings and contact us if you feel like you’d fit in.
We will be deploying V2 of our website and API Monday morning. There will be some downtime between 8 am and 1 pm central time on Monday, but things should be good after that. V1 of our API will no longer be available after this change.
The V2 API provides better functionality, reliability, and performance than the V1 API. You can view initial documentation for it here: http://datafiniti.github.io/developer.datafiniti.net/.
So this is exciting. Within 6 months of moving our team to Austin, we’ve been named by the Austin Chamber of Commerce as one of their 2013 A-List companies. We were included in the “Emerging” category and selected from a group of 157 companies. You can read more here: http://impactnews.com/austin-metro/central-austin/%27a-list%27-companies-named/.
It’s always nice to get recognition, but we still have a lot of work to do to achieve our goal of making web data fully accessible. Earlier this week we gave early access to our V2 API through a new Download App. The new version of our website and API will be made publicly available next week. We’re still considering ourselves to be in beta mode, though, so please give us any feedback you have.
In addition to these upcoming releases, we’re also working on a giant upgrade to our back-end architecture to dramatically improve the volume and rate of web content we’re ingesting. We have two metrics we’re targeting:
- Crawling individual, content-rich websites at a rate of 100,000 URLs per day. Each website-specific daily crawl will track high-priority businesses and products to provide daily-updated data on reviews and prices.
- Crawling more than 1 billion URLs every month. This will enhance our web-wide crawling for more content/data discovery on businesses, people, and products.
Meeting these goals is our focus for the next 2 months, so that starting from January, we’ll be in a great position to provide significantly more up-to-date data to our customers.
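To put those two targets in perspective, here is a quick back-of-the-envelope conversion into sustained URLs per second. It is a sketch only: it assumes a 30-day month and perfectly even crawling around the clock, neither of which is stated above, and the function name is ours.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def urls_per_second(urls: int, days: float) -> float:
    """Average crawl rate needed to fetch `urls` within `days` days."""
    return urls / (days * SECONDS_PER_DAY)


# Target 1: 100,000 URLs per day for a single content-rich website.
per_site = urls_per_second(100_000, days=1)

# Target 2: 1 billion URLs per month (assumption: a 30-day month).
web_wide = urls_per_second(1_000_000_000, days=30)

print(f"Per-site crawl: {per_site:.1f} URLs/second")
print(f"Web-wide crawl: {web_wide:.1f} URLs/second")
```

Under those assumptions, a single-site crawl needs a little over 1 URL per second, while the web-wide target works out to roughly 386 URLs per second sustained.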
Exciting news! V2 of our download app is now available at https://github.com/datafiniti/DataDownloader. If you’re an existing Datafiniti customer, you should begin using this download app as soon as possible. We plan to retire the V1 download app sometime next week.
Here are some specific changes you should be aware of when using the V2 download app:
- Solr vs SQL syntax. The V2 download app uses Solr query syntax as opposed to a SQL-style syntax. If you’re an existing customer, you’ll be receiving an email with sample queries matching your use case to help you construct the new queries. You can learn more about Solr syntax here: http://wiki.apache.org/solr/SolrQuerySyntax.
- File format. The V2 API formats files, particularly CSVs, a bit differently. You may need to update your handling of files from Datafiniti accordingly.
- File generation. The V2 API generates files differently from the V1 API. The V2 API creates a single .zip file that contains one or more files inside it.
- Purpose of V2 download app. The V2 app is meant to serve as an example application for working with the V2 API. We will not be making regular updates for it, though we will address any immediate bugs that show up. Instead, our hope is that customers will build their own applications on top of the V2 API. We plan on releasing documentation for the V2 API within the next month. We will be providing language-specific API drivers (over time) to help with this as well, which will be supported.
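If you are updating your own file handling for the single .zip format, something along these lines may be a useful starting point. It is a minimal sketch using only the Python standard library. The file names, column names, and UTF-8 encoding are assumptions for illustration, not a description of the actual archive layout.

```python
import csv
import io
import zipfile


def read_csv_rows_from_zip(zip_bytes: bytes) -> dict[str, list[dict[str, str]]]:
    """Read every CSV inside a .zip archive, keyed by file name."""
    rows_by_file = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue  # skip anything that is not a CSV
            with archive.open(name) as raw:
                # Assumption: the files are UTF-8 encoded.
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                rows_by_file[name] = list(csv.DictReader(text))
    return rows_by_file


def make_sample_zip() -> bytes:
    """Build a tiny in-memory archive so the sketch runs without a download."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # Hypothetical file names and columns.
        archive.writestr("part-1.csv", "name,city\nExample Cafe,Austin\n")
        archive.writestr("part-2.csv", "name,city\nSample Diner,Houston\n")
    return buffer.getvalue()


sample = read_csv_rows_from_zip(make_sample_zip())
for file_name, rows in sample.items():
    print(file_name, rows)
```

For a real download you would pass the bytes of the downloaded file instead of `make_sample_zip()`.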
The V2 website is also on its way. It will most likely be available within 2 weeks from today. Any downloads made from the V2 download app will show up under a new “My Downloads” section in the V2 website.
If you do encounter any bugs when using the download app, please let us know. These bugs are likely a result of issues with the V2 API. There are a couple of known issues with the V2 API we are trying to isolate and resolve. The download app itself is very light.
Once the V2 website is available, we will be working on transitioning as many existing customers as possible to subscriptions charged directly from the website. We will also be setting up these customers with the appropriate subscription tiers and corresponding credit limits.
Datafiniti V2 will be available next week. This includes:
- Early access to our V2 download app for existing customers.
- Early access to our V2 API for existing customers.
- A new version of our website, with updated pricing information and improved searching.
If you are an existing customer, we will provide sample queries to replace your V1 API queries. The V2 API uses Solr syntax. The documentation is not yet available, but we are working on it right now.
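Until the documentation is out, this rough sketch shows what the move from a SQL-style filter to Solr syntax tends to look like. The field names (`city`, `category`, `price`) are hypothetical and the helper functions are ours; only the general Solr forms (`field:"value"`, `AND`, and `[low TO high]` ranges) are standard.

```python
def solr_term(field: str, value: str) -> str:
    """Build a quoted Solr field clause, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


def solr_range(field: str, low: int, high: int) -> str:
    """Build an inclusive Solr range clause."""
    return f"{field}:[{low} TO {high}]"


def solr_and(*clauses: str) -> str:
    """Join clauses so that all of them must match."""
    return " AND ".join(clauses)


# A SQL-style filter, shown only for comparison (hypothetical fields).
sql_style = "city = 'Austin' AND category = 'Restaurant'"

# The same filter expressed in Solr query syntax.
solr_style = solr_and(
    solr_term("city", "Austin"),
    solr_term("category", "Restaurant"),
)

print(solr_style)                    # city:"Austin" AND category:"Restaurant"
print(solr_range("price", 10, 20))   # price:[10 TO 20]
```

The sample queries we send to existing customers will use the real field names for your use case.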
We thought it would be fun to provide some links for your weekend reading enjoyment. Here are some stories that caught our eye recently:
Statistical models can predict a Kickstarter’s Success within 4 hours: Researchers have built an accurate model that can quickly determine how successful a Kickstarter campaign will be.
Pricing strategies in the used-car market: One of our customers, Dealerslink, provides insight on why cheap used cars don’t always sell the best.
Customer expectations on Facebook vs Twitter: KISSMetrics details how customer service is different on each social network.
New Download App Coming Next Week
Next week we expect to release a new version of our download app, which customers with API access can use to retrieve data from Datafiniti. The new download app will use the V2 API and replace the current download app, which uses the V1 API. Both download apps (and APIs) will be available for 2 weeks following the release of the new download app. After this time, we will disable the V1 API permanently.
V2 API Documentation
In addition to the deployment of the V2 API, we will release documentation. At first, this documentation will be a bit bare bones, but we will gradually fill it in. The documentation itself will be open source, so users can request changes through pull requests and contribute to the documentation.
Here are a few examples of new data within Datafiniti:
- Approximately 200,000 hotels with room types (out of over 500,000 hotels in total)
- New review sources for business and product data
- Approximately 100,000 new or updated car product records imported each day
New Data QC and Bug Fix Process
Our team has been discussing how to formalize a “bug fix” process for data within Datafiniti. Next week we’ll begin sketching out the implementation of a set process for accepting requests for data fixes from customers, issuing fixes, and notifying customers. We’re excited to get this process in place!