
Google on Thursday announced a slew of new features for Google BigQuery, its service for quickly analyzing large amounts of data, to let analytic teams deliver what organizations really need: “actionable and data-driven business insights.” In short, Google has added new capabilities to help businesses work effectively with large amounts of data over a greater range of query and data types.
Here are the three new features Google wants to highlight:
The new Big JOIN feature gives users the ability to produce a result set by merging data from two large tables by a common key: you can skip a data transformation step by simply specifying JOIN operations using SQL. Big Group Aggregations meanwhile significantly increase the number of distinct values that can be grouped in a result set. To use these two new features, all you have to do is add the EACH modifier to JOIN or GROUP BY clauses.
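For a rough idea of what this looks like in practice, here is a minimal sketch that assembles such a query as a Python string. The table names, column names, and the `build_big_join_query` helper are hypothetical and chosen only for illustration; the only parts taken from the announcement are the EACH modifiers on JOIN and GROUP BY.

```python
# Hypothetical table names, used only for illustration.
USAGE_TABLE = "[billing.usage]"
CONFIG_TABLE = "[billing.app_config]"


def build_big_join_query(usage_table: str, config_table: str) -> str:
    """Return a SQL-like query that joins two large tables and groups the result."""
    return (
        "SELECT c.customer_id AS customer_id, "
        "SUM(u.bytes_used) AS total_bytes "
        f"FROM {usage_table} AS u "
        # EACH on the JOIN signals that both tables may be large.
        f"JOIN EACH {config_table} AS c "
        "ON u.app_id = c.app_id "
        # EACH on the GROUP BY allows many distinct group keys.
        "GROUP EACH BY customer_id"
    )


query = build_big_join_query(USAGE_TABLE, CONFIG_TABLE)
print(query)
```

The string would then be submitted through whichever BigQuery client or UI you already use; that step is left out here because it depends on your setup.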
The new TIMESTAMP data type lets you import date and time values in formats familiar to users of databases such as MySQL, while still preserving timezone offset information. There are also new functions for converting these fields into other formats, calculating intervals, and extracting components such as the hour, day of week, and quarter.
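To make those operations concrete, here is a small sketch in plain Python rather than BigQuery itself. It takes a MySQL-style date and time string with a timezone offset (an invented example value), extracts the same kinds of components, and computes an interval. The `quarter` helper is an assumption written for this example and does not correspond to any named BigQuery function.

```python
from datetime import datetime, timezone

# Invented example value: "YYYY-MM-DD HH:MM:SS" plus a UTC offset.
raw = "2013-03-14 10:30:00-07:00"

# Parsing keeps the -07:00 offset attached to the value.
ts = datetime.fromisoformat(raw)


def quarter(moment: datetime) -> int:
    """Map months 1-12 onto quarters 1-4."""
    return (moment.month - 1) // 3 + 1


components = {
    "hour": ts.hour,                  # hour in the original timezone
    "day_of_week": ts.isoweekday(),   # ISO numbering: Monday=1 ... Sunday=7
    "quarter": quarter(ts),
}

# Convert to UTC, then measure the interval since midnight UTC that day.
as_utc = ts.astimezone(timezone.utc)
midnight_utc = datetime(2013, 3, 14, tzinfo=timezone.utc)
interval = as_utc - midnight_utc

print(components, as_utc.isoformat(), interval)
```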
Google has also added the ability to add new columns to existing BigQuery tables. To do so, provide a new schema with additional columns using either the “Tables: update” or “Tables: patch” BigQuery API methods.
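As a sketch of what "a new schema with additional columns" might look like, the snippet below builds a request body in plain Python. The `add_columns` helper and the field names are hypothetical, and the snippet makes no network call; it only shows the existing fields being carried over with the new ones appended, which is the shape you would pass to either API method.

```python
import copy


def add_columns(schema: dict, new_fields: list) -> dict:
    """Return a request body whose schema is the old schema plus new_fields."""
    existing = {field["name"] for field in schema["fields"]}
    for field in new_fields:
        if field["name"] in existing:
            raise ValueError(f"column already exists: {field['name']}")
    # Copy so the caller's schema is left untouched.
    updated = copy.deepcopy(schema)
    updated["fields"].extend(copy.deepcopy(new_fields))
    return {"schema": updated}


# Hypothetical existing table schema.
current_schema = {
    "fields": [
        {"name": "app_id", "type": "STRING", "mode": "REQUIRED"},
        {"name": "bytes_used", "type": "INTEGER", "mode": "NULLABLE"},
    ]
}

# Hypothetical new column, using the TIMESTAMP type mentioned above.
body = add_columns(
    current_schema,
    [{"name": "recorded_at", "type": "TIMESTAMP", "mode": "NULLABLE"}],
)
print(body)
```

The new column is marked NULLABLE in this sketch because rows already in the table would have no value for it.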
Last but not least, there are now direct links to individual datasets in the BigQuery Web UI, so authorized users can quickly access a dataset and bookmark or share it. Email notifications have also been added to inform users when they’ve been given dataset access privileges.
Google explains that these features, working in conjunction, let you join and perform aggregate analysis on multi-terabyte datasets using SQL-like queries or integrated third-party tools. Without them, the company argues, you’d have to initiate complex coding projects, which of course cost both time and money.
In fact, Google is eating its own dog food when it comes to BigQuery:
For example, when our App Engine team needed to reconcile app billing and usage information, Big JOIN allowed the team to merge 2TB of usage data with 10GB of configuration data in 60 seconds. Big Group Aggregations enabled them to immediately segment those results by customer.
It’s difficult to argue with figures like that.
View original post here: Google updates BigQuery with SQL-like queries, grouping of distinct values, and support for Timestamp data

Originally created for New York’s Reinvent Green Hackathon to help hackers create maps and data visualizations, these gorgeous, interactive maps of NYC’s green data show off just a small piece of all the data that’s available, but are still quite impressive to play with.
Check them out below, or head here to see them all at once!
A census of street trees by borough yielded datasets with a total of 623,939 trees (Source).
New York City Department of Parks & Recreation parks (Source).
There are over 200 cool roof buildings with a total surface of over a million square feet in New York City. This newly opened dataset identifies them by street address and geo location (Source).
Energy consumption by ZIP code in kWh. Blue is lower consumption, magenta higher consumption (Source).
Rich dataset containing over one million building footprints of New York City, here colored by footprint area (Source).
Small planted areas that are maintained as Greenstreets (Source).
Check out the full site via the link below, and vote for the winning hacks from the Reinvent Green Hackathon here. For more on the hackathon, you can also take a look at our coverage of the event here.
➤ NYC Green Data, by MapBox
See more here: Take a look at this gorgeous, interactive map of NYC’s green data

“I believe CrunchBase will gain a lot of attention from the academia soon, which is always eager for high-quality data set,” writes Guang Xiang of Carnegie Mellon University, who found that he could predict mergers and acquisitions much better using the unique business variables available in CrunchBase than with the traditional databases used by academics. Thanks, Xiang, flattery will get you everywhere.
“Traditionally, people only used numeric variables/features for M&A prediction, such as ROI, etc. CrunchBase and TechCrunch provided a much richer corpus for the task,” he writes. Specifically, CrunchBase gave him data on roughly 43 times as many companies as the usual dataset (2,300 vs. more than 100,000) and access to valuable variables, such as management structure, financing, and media coverage.
For instance, “Strong financial backing is generally considered critical to the success of a company,” but traditional datasets won’t have detailed information on the management, their experience, and the funding rounds.
Even better, the news coverage itself on TechCrunch could also be a predictor of a merger or acquisition (because, well, duh, if a company’s doing well enough to make the news, there’s a good chance someone is also itching to buy it out).
But, just when we were starting to blush, Xiang brought out the criticism: “Despite its large magnitude, the CrunchBase corpus is sparse with many missing attributes,” because the community-created database tends to focus on more popular companies and features. That said, even with these drawbacks, the researchers still achieved “good performance” with CrunchBase, which, impressively enough, has been managed all these years by superwoman Gene Teare.
M&A activity is just the tip of the iceberg, and there are all sorts of business questions that could be answered using the vast amounts of data provided by CrunchBase. So, statisticians and business analysts, go nuts. And, when you find something cool, let us know first (tips@techcrunch.com).
More here: Thanks, Science! New Study Says CrunchBase Is An Information Treasure Trove