Entries Tagged 'tech' ↓
Found on ProgrammableWeb authored by Kevin Farnham
November 21st, 2008 — tech
Hoover’s, the business information company, has released their own Hoover’s API, aka “HAPI”. This new API enables developers to create applications that utilize Hoover’s database of 27 million companies and 34 million business executives (details at our Hoover’s API profile). This is certainly useful data for a variety of enterprise applications.
Hoover’s VP of Business Development Heidi Tucker gave a presentation titled “Hoover’s - API Strategy - Open Access to Business Intelligence” at Mashery’s recent Business of APIs Conference in San Francisco, CA. Heidi highlighted several current implementations of the Hoover’s API:
- CRM application with “company look up and list build” and “data enrichment upsell”
- Address verification: Shipping department checks address info against Hoover’s to reduce bad address non-deliverables
- Outbound telemarketing: Hoover’s info populates predictive dialer contact info
- Web analytics publisher: Users see traffic and corresponding basic business information
Interestingly, Heidi also outlined some of the risks involved in opening up this valuable data via an API:
- Keys to the Kingdom
- Search Results Relevancy
- Scalability
- Cannibalization
- Brand Compliance
- Reputational Risk
- Pricing
Technically, the Hoover’s API is SOAP-based with data returned in XML. There is a free testing API key; production API keys are available for a fee. Pricing can be structured in one of three ways: per user per month, per API call, or as a flat content license fee.
The API is well-documented. The WSDL is available for review, and the development team provides sample code for .NET, Java, and PHP developers. The Hoover’s API Developer Blog and Hoover’s API Forums are available.
As more of this class of valuable business data becomes available via open APIs, we should expect to see interesting new enterprise mashups appearing soon.
Found on Mashable! authored by Sean P. Aune
June 6th, 2008 — tech

Adobe’s popular AIR runtime is gaining more and more fans, and with them, more applications than ever. From fun applications that let you order pizza from your desktop to applications that let you track your investments online and off, the entire spectrum is out there, and this guide should help you find at least one or two that fit your life.
Financial and Productivity
Agile Agenda - Helps with project management, letting you schedule and assign tasks, assigning priorities and more.
Google Calendar RSS Invoice Creator - If you use Google Calendar to track your work on projects, you can use this app to parse your RSS feed and create invoices for your time to send to your clients.
Klok - A tool for freelance workers to manage their billable hours to clients. Also gives you tools for visualizing what you spend the majority of your time on, managing your projects and more.
NASDAQ Market Replay - A powerful tool allowing you to look over the history of any security and study its trends.
Portfolio Viewer - View multiple investment portfolios, and you can view them offline as all information is stored locally.
StockQ - Start streaming stock prices to your desktop without having to belong to any other sites. Set refresh intervals from 1 second to 5 minutes; you have complete control.
Wicked Stickies! - Make post-it-like stickers where you can put down notes to help with your GTD activities, even set them with an expiration date so they won’t clutter up your desk.

Found on ReadWriteWeb authored by Sarah Perez
May 14th, 2008 — tech
Sometimes you stumble across something that really makes you say "wow" and reminds you that there's so much more to this internet thing than just the latest web app. Case in point is this article describing some of the visual resources available on the web. The deep web. These images won't show up in search engines' image searches or on Flickr (save one exception), but instead can only be accessed via the links below.
The images are a part of online collections created by institutions in the U.S. Some of the images may be a part of the public domain, but many will require permission or accreditation in order to use. So, no, these aren't necessarily images you can use in your next blog post, but that doesn't mean they're not useful. Instead, if given permission, these images could be used in the classroom, in private study, or even included in a media project or publication.
Collaborative digital collections
- Alabama Mosaic: Thousands of images that can be searched by keyword. Images are from historical collections featuring content from libraries, archives and museums from across Alabama.
- Alaska Digital Archives: More than 5,000 quality digital images of Alaska's heritage in a searchable online database.
- Calisphere: A free online collection of more than 150,000 digitized primary materials contributed by libraries, archives, and museums from all over California. Search for content by keyword, by browsing the alphabetized subject list and exploring theme collections, such as the Gold Rush Era and World War II. Lesson plans are also available for elementary and secondary schoolteachers.
- Library of Congress American History and Culture Collections: These collections began as a pilot project in 1990 to provide middle school as well as high school teachers and students with digital surrogates of collection material on CD-ROM. Over the years, the collection has become a "National Digital Library" with diverse institutions from all across the United States contributing content. Search or browse alphabetized subject lists, time periods, and geographical locations. American Memory Historical Collections features more than 100 thematic subjects ranging from advertising to maps to women's rights.
- Library of Congress International Collections: Access content from American Memory Historical Collections as well as international visual resource collections, such as the Abdul Hamid II collection of photographs of the Ottoman Empire and the Prokudin-Gorskii collection of photographs of the Russian Empire. Additionally, through partnerships with national libraries in other countries, you can access collections that highlight the history of the United States in relation to other nations, such as "France in America" and "The Meeting of Frontiers: Siberia, Alaska and the American West."
- University of Washington Digital Collections: Access to tens of thousands of digital images covering a wide variety of subjects, but with an emphasis on the Pacific Northwest. The digital collections include image-heavy resources, such as the J. Willis Sayre Photographs of actors, vaudeville performers, and movie stills; the Washington Women's History Consortium Fashion Plate Collection; the Dearborn-Massar Photographs of Architecture; and the Seattle Photographs Collection.
- Photomuse: A research resource for the history of photography. Features online exhibitions, a chronology of the evolution of photography complete with visuals and historical information, as well as an image database.
University digital image collections
- Duke Digital Collections: Featured collections are freely available on the Internet and include the Emergence of Advertising in America, Ration Coupons on the Home Front (1942-1945), and the 50,000 item William Gedney Photographs and Writings collection.
- Yale University Library Digital Collections: More than 100,000 digital images are searchable and viewable by the public.
- Harvard University Library: A Selection of Web-Accessible Collections: A list of visual resource collections that are unique to Harvard University, but reside in different repositories on the Harvard campus. Collections include the Harvard Daguerreotype Collection, the Hedda Morrison Photographs of China, Immigration to the United States (1789-1930), Legal Portraits Online, and the Latin American Pamphlet Digital Collection.
Digital image collections at public libraries and archives
- Historical Photograph Collections at the Arizona State Archives: 33,000 digital images of primary materials from the historical photograph collections. Most of the photographs available through the public online database date to before 1940 and include examples of all types of photographic processes, including tintypes, glass lantern slides, and photographic postcards.
- Library of Congress Prints and Photographs Online Catalog: Get access to more than 1 million digital images via one of the largest digital image databases in the world. Search for images by keyword, by browsing lists of alphabetized subjects, or by choosing a collection and looking through individual image records.
- Los Angeles Public Library: More than 60,000 images featuring the work of many notable photographers active in the Los Angeles area over many decades, including some contemporary photographers. Search by keyword or photographer.
- New York Public Library Digital Gallery: One of the largest open-access image databases available on the Internet featuring more than 600,000 digital images, including all kinds of primary materials, such as manuscripts, maps, photographs, prints, restaurant menus, sheet music covers, and much more.
Digital image collections at historical societies
- Indiana Historical Society: An extensive collection, covering topics ranging from architecture to railroads to sporting events.
- Wisconsin Historical Society: A visual resource for Wisconsin history containing 35,000 photographs. Of special interest is the Wisconsin Historical Museum's Children's Clothing Collection where visitors may browse images of more than 2,000 articles of children's clothing dating back to the 18th century.
Other
Library of Congress
You can learn more about the history of these collections and get details on how to search them from the article here.

Found on TechCrunch authored by Michael Arrington
May 11th, 2008 — tech

Today marks another milestone for San Francisco based contextual search engine Powerset. They’ve launched a showcase for their user search experience - effectively the search engine minus the web crawl. For now, Powerset queries only Wikipedia and augments results with data from Freebase. The product launch comes just a day after reports that the company is being shopped to potential buyers by investment bank Allen & Co.
I have been able to test Powerset via their labs site for the last few weeks. I wrote about it last month, and the version that just launched is very similar.
There is no way to look at Powerset today and determine if it can be as disruptive to search as Google was when it launched almost a decade ago. That’s because it only queries Wikipedia, and so there is little need for proper ranking algorithms to sort the good from the bad results.
But what users can see is how effective a way it is to gather information quickly. For someone doing research, Powerset effectively removes a number of steps towards getting to the final information. It is particularly effective when the information needed is on many different web pages.
For example, a query on Powerset of “when did earthquakes hit tokyo” yields stunning results. Try this query at Google or even Wikipedia to compare. Instead of just picking out keywords that are in your query and on a web page, Powerset is actually making some sense of the content included in the Wikipedia pages:

The way that Powerset returns queries means that answers are often found in the result snips, as above. They are also structuring a lot of the Wikipedia (and already structured Freebase) data and inserting it into results. So a search for “Bill Clinton” shows results, but also shows Freebase structured data along with additional query refinements to get to more information. The important thing below isn’t the structured data in the results; it’s the fact that you can click on the action words and drill down into very specific queries (to find, for example, what bills he signed, or which Supreme Court justices he nominated, or who he slept with).

Powerset is indexing web pages much differently than normal search engines, which generally just record content to match against keyword queries. Instead, Powerset is trying to understand the content on the page so that it can be matched meaningfully to queries later. Even queries that don’t use matching words.
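To make the contrast concrete, here is a minimal sketch of the keyword-style indexing that the paragraph above attributes to conventional engines. It is a toy inverted index, not Powerset's or any real engine's implementation; the sample documents and the function names `build_index` and `search` are invented for illustration.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word.strip(".,?!")].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every word in the query."""
    words = [w.strip(".,?!") for w in query.lower().split()]
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical sample documents, made up for this sketch.
docs = {
    1: "A major earthquake struck Tokyo in 1923.",
    2: "Tokyo is the capital of Japan.",
}
index = build_index(docs)

print(search(index, "earthquake tokyo"))  # matches document 1
print(search(index, "quake hit tokyo"))   # empty: "quake" and "hit" never appear
```

The second query asks for the same thing as the first but shares too few literal words with the page, so a pure keyword match returns nothing. That is the gap an engine that tries to understand page content aims to close.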
Indexing the web is expensive, though, and Powerset’s way of doing it requires even more time and computing power dedicated to a web page. That’s why they say they aren’t indexing the entire web yet - the company has raised just $12.5 million (plus another $8 million or so in bridge loans from investors). To index the web will require a new round of financing (see the first paragraph above about their sale/financing efforts).
Powerset has taken a lot of criticism for their goal of trying to redefine how people search the web (including from us). But their lofty goals are what makes Silicon Valley so great - succeed or fail, Powerset is trying to do something pretty spectacular.
The company has also created a demo overview video - see below.

Found on TheServerSide.com: News authored by Jevgeni Kabanov
April 30th, 2008 — tech
ZeroTurnaround has announced the final release of JavaRebel 1.1. JavaRebel is a JVM plugin (-javaagent) that enables reloading changes made to Java class files on-the-fly, saving developers the time that it takes to redeploy an application or perform a container restart. In addition to changes like the provision for dynamic proxies, full SDK availability, and full class reloading, ZeroTurnaround is offering a free license for bug reports.

Found on ProgrammableWeb authored by Raymond Yee
April 29th, 2008 — tech
As the national library of the United States, the Library of Congress has created vast amounts of metadata to describe books and other documents in its collection. Among this metadata is the Library of Congress Subject Headings (LCSH), a “controlled vocabulary” for classifying documents by subject. In other words, experts at the Library of Congress have come up with a (large) list of subject headers from which catalogers of documents can choose. As an example, if you look at the Library of Congress record for Tim Berners-Lee’s book Weaving the Web, you’ll see that it is classified under “World Wide Web”, specifically “World Wide Web–History”.
Since the Library of Congress isn’t the only entity that classifies documents, you can imagine that other entities (and not just libraries) would be interested in reusing the LCSH vocabulary. But how should the Library of Congress make LCSH available so that it can be easily reused?
That’s where the recent release of lcsh.info comes in (see also the lcsh.info ProgrammableWeb Profile):
This is an experimental service that makes the Library of Congress Subject Headings available as linked-data using the SKOS vocabulary. The goal of lcsh.info is to encourage experimentation and use of LCSH on the web with the hopes of informing a similar effort at the Library of Congress to make a continually updated version available. More information about the Linked Data effort can be found on the W3C Wiki.
Let’s look at what you can do with lcsh.info through a couple of examples. First, we return to the subject heading World Wide Web, this time accessible from lcsh.info as
http://lcsh.info/sh95000541
Note the form of the URL: http://lcsh.info/{lccn} where lccn refers to the Library of Congress Control Number (LCCN), an identifier of the subject heading. In this case, the LCCN for World Wide Web is sh95000541.
If you drop this URL into your browser, you’ll get the default format or representation of the information lcsh.info has about the World Wide Web subject header, including:
The diagram below illustrates some of these relationships
To facilitate reuse of the data, lcsh.info offers its data in a variety of formats that can be accessed via content negotiation. That is, you use the Accept HTTP header to specify which of the following content types you want:
- XHTML (with embedded RDFa), which is the default value (application/xhtml+xml)
- JSON (application/json)
- RDF/XML (application/rdf+xml)
- N3 (text/n3)
For example, you can use curl to get the JSON representation of the World Wide Web subject header:
curl -v -L -H "Accept: application/json" http://lcsh.info/sh95000541
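The same content negotiation can be sketched in Python with the standard library. This is an illustrative sketch only: the helper name `build_request` is invented here, and the request is built but not sent, since lcsh.info was an experimental service and may not be reachable.

```python
import urllib.request

BASE_URL = "http://lcsh.info/"  # service named in the post; may no longer be online


def build_request(lccn, content_type):
    """Build (but do not send) a GET request asking for one representation."""
    return urllib.request.Request(BASE_URL + lccn, headers={"Accept": content_type})


# sh95000541 is the LCCN the post gives for "World Wide Web".
req = build_request("sh95000541", "application/json")
print(req.full_url)              # http://lcsh.info/sh95000541
print(req.get_header("Accept"))  # application/json

# To actually fetch, assuming the service is reachable:
# with urllib.request.urlopen(req) as resp:
#     body = resp.read().decode("utf-8")
```

Swapping the second argument for `application/rdf+xml` or `text/n3` would request the other representations listed above from the same URL.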
By looking at the RDF/XML and N3 representations, you can see a concrete example of semantic web approaches to express notions of broader, narrower, and related terms as well as alternative labels using
- Simple Knowledge Organization System (SKOS), which is “a model for expressing the basic structure and content of concept schemes such as thesauri, classification schemes, subject heading lists, taxonomies, folksonomies, and other types of controlled vocabulary”
- design rules for linked data to represent the network of interconnected subject headings
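As a rough illustration of the broader, narrower, and alternative-label notions that SKOS expresses, the sketch below models a tiny concept scheme in plain Python. It is not an RDF or SKOS library, and it does not reproduce lcsh.info's actual data: only the id sh95000541 and the label "World Wide Web" come from the post, while every other id, label, and link is a hypothetical placeholder.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One subject heading with SKOS-style labels and links."""
    pref_label: str
    alt_labels: list = field(default_factory=list)
    broader: list = field(default_factory=list)  # ids of broader concepts
    related: list = field(default_factory=list)  # ids of related concepts


# Only sh95000541 / "World Wide Web" comes from the post; the alternative
# label and the other ids, labels, and links are hypothetical placeholders.
scheme = {
    "sh95000541": Concept("World Wide Web", alt_labels=["WWW"], broader=["ex-hypermedia"]),
    "ex-hypermedia": Concept("Hypermedia (placeholder)", broader=["ex-media"]),
    "ex-media": Concept("Media (placeholder)"),
}


def narrower(scheme, concept_id):
    """Derive narrower terms as the inverse of the broader links."""
    return [cid for cid, c in scheme.items() if concept_id in c.broader]


def ancestors(scheme, concept_id):
    """Follow broader links transitively, guarding against cycles."""
    seen, stack = [], list(scheme[concept_id].broader)
    while stack:
        cid = stack.pop()
        if cid not in seen:
            seen.append(cid)
            stack.extend(scheme[cid].broader)
    return seen


print(ancestors(scheme, "sh95000541"))
print(narrower(scheme, "ex-hypermedia"))
```

Storing only the broader links and deriving narrower ones on demand keeps the two directions from drifting out of sync, which is one reasonable way to hold such a vocabulary in memory.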
This experimental but promising service may soon pave the way for full production level web services from the Library of Congress.

Found on InfoQ Personalized Feed for Glenn Marcus authored by Robert Bazinet
April 18th, 2008 — tech
Developers today are constantly creating applications that consume services of other web sites. Consuming these services requires figuring out and understanding the sometimes complex Application Programming Interfaces (APIs).
Found on ReadWriteWeb authored by Sarah Perez
April 9th, 2008 — tech
Today, a story on Techmeme caught our eye. It was entitled "We Need a Wikipedia for data," and the article, written by ex-Googler Bret Taylor, discussed the difficulty of finding open data sets on the internet. Such data sets could spur innovation, allowing programmers to build new applications the likes of which have never been seen before. What was interesting about this story, beyond the concept of a data wiki itself, was the insightful commentary around it, not just on the blog but all over the net, which led to the discovery of some pretty good data sources that are already available.
In Bret's story, he mentioned some of the common data sources currently available, like the US Census Bureau's map data and the Reuters corpus, but his commenters came up with a few more. (See? This is why blog comments matter).
In addition, as CNet and Ryan Stewart's blog spread the story, more people chimed in with suggestions. And of course, the Hacker News guys had some more ideas themselves.
So what did everyone come up with? A lot of data sources are already freely available on the net, as it turns out, if you just know where to look. Here's a summary. Do you have anything to add?
CKAN (Comprehensive Knowledge Archive Network)
The CKAN site is a registry of open knowledge packages and projects. Here, you can find open knowledge resources or register one of your own. What kind of stuff can you find at CKAN? They mention a set of Shakespeare's works, a global population density database, the voting records of MPs, or 30 years of US patents as some examples, but they also point you to some useful URLs, like flickr's Creative Commons page, where photos can be searched by license type.
Infochimps.org
This project is attempting to assemble and interconnect the world's best repository for raw data - like a giant, free, open almanac. The best way to describe it comes from MetaFilter, where the project was spotted recently: "Just as Wikipedia will help you find out something about everything, infochimps.org will help you find out everything about something." What can you find there? Every Wikipedia infobox (each infobox type in its own table), 50 years of global hourly weather data, all the tables from the US Census Statistical Abstract, oh, and 100,000 official crossword words, too.
OpenStreetMap
Not a data set in the traditional sense, but definitely a useful tool, OpenStreetMap is a free, editable map of the world where you can view, edit, and use your own geographical data. The project was started because most maps actually have legal or technical restrictions on their use.
MusicBrainz
A user-maintained community metadatabase site which collects music "metadata" like artist name, release title, list of tracks, etc. You can browse through the site or you can use a client program, like their own taggers, to help identify music collections.
Jigsaw
Dismissed by the blogosphere as a bad idea, if not downright evil, Jigsaw, the marketplace that pays you to give up other people's contact info, now boasts 7 million complete contacts for the taking.
DBpedia
This site is a community effort to extract structured info from Wikipedia and make that data publicly available on the web, essentially turning Wikipedia into a database you can query. Is this the beginnings of a semantic web? Check out their downloads section for the datasets and then scroll to the bottom for even more links to data sources on the web.
flickr wrappr
Where DBpedia takes Wikipedia and makes it semantic, flickr wrappr extends DBpedia with RDF links to photos posted on flickr. Here's an example. Here's another. This is pure geek hotness.
Freebase
Freebase, an open, shared database of the world's knowledge, received a lot of mentions in the comments, so this must be a good one. Community built and maintained, it pulls from open data sources like Wikipedia, MusicBrainz, and the SEC archives to create structured information on many topics, including more popular ones like movies, music, people, and locations. The site, unlike some of the others in this list, is also easy to navigate and well-designed, which makes it that much better to use.
Opentick
Perhaps one of the less interesting items due to its dry subject matter - financial data - it's certainly worth a mention because a free database of real-time and historical market data for trading systems and platforms is the kind of thing that really floats some people's boats.
ThingISBN
Thanks to LibraryThing, ThingISBN is the site's first API, and even though its competitor became a paid service, ThingISBN is still free for non-commercial use. The API doesn't just return the usual book data, but also something called "edition disambiguation," meaning it also returns a list of "related" ISBNs—other editions, other media, and translations.
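A client consuming this kind of "related ISBNs" result might look roughly like the sketch below. The XML shape used here (an `idlist` element wrapping `isbn` elements) is an assumption made for illustration and should be checked against LibraryThing's own documentation; the ISBN values are placeholders, and `related_isbns` is an invented helper name.

```python
import xml.etree.ElementTree as ET

# Assumed response shape, written by hand for this sketch; the ISBNs are placeholders.
SAMPLE_RESPONSE = """<idlist>
  <isbn>0000000001</isbn>
  <isbn>0000000002</isbn>
  <isbn>000000000X</isbn>
</idlist>"""


def related_isbns(xml_text, query_isbn):
    """Return the ISBNs in the response, excluding the one that was queried."""
    root = ET.fromstring(xml_text)
    return [el.text.strip() for el in root.iter("isbn")
            if el.text and el.text.strip() != query_isbn]


print(related_isbns(SAMPLE_RESPONSE, "0000000001"))
```

ISBNs are kept as strings rather than integers because a valid ISBN-10 can end in the check character X and may begin with a zero.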
Numbrary
Like the title suggests, Numbrary is a library for numbers. This free service helps you find, use, and share numbers from public record data sets, like census data or the CIA World Factbook.
theinfo.org
This site isn't just a place to build or collect data sets, of which they have quite a nice list, but a place where you can interact with other number-lovin' folks like yourself.
The Data Wrangling blog
This blog post lists a bunch, and I mean a bunch, of open datasets on the web, which just goes to show how much of a cursory list my post really is.

Found on Gizmodo authored by Adam Frucci
March 31st, 2008 — tech
How dedicated are you to using a Bluetooth microphone with your phone? Are you dedicated enough to drill a small hole in your teeth to install a tiny mic? Well, if so, here's one for you. Hit the jump for a picture of it in-mouth and a word of warning about DIY dentistry.

The durable composite resin filling is designed to fit in a hole 2.2 mm in diameter and 1.7 mm deep and will pick up sound and vibrations from your mouth to produce incredibly clear sound.
I don't know about you, but I think I'd rather stick with a regular Bluetooth headset, especially when this thing still requires you to wear something in your ear so you can hear what's going on. But hey, it's up to you. And as Chinavasion, the seller, reminds you, don't go drilling holes in your teeth yourself. "All dental work should be performed by a qualified dentist, Chinavasion does not take responsibility for injury resulting from the installation of this product." Yikes. [Product Page via Geek Alerts]

Found on ReadWriteWeb authored by Charles Knight, AltSearchEngines editor
March 7th, 2008 — tech
This post was syndicated from Alt Search Engines, our alternative and niche search engines blog. Editor's Note: the style of this post is different to what we'd normally do here, but we think the technology is interesting enough to run the post as-is.
eeggi (engineered, encyclopedic, global and grammatical identities) is the world’s first mathematically-based Search and Retrieve, Response, and Discovery engine (ReDi engine), capable of focusing on the concept of text and not just the text itself.
A ReDi engine is a new type of engine capable of not only searching and retrieving information but also responding to direct questions (in what country did Napoleon die?) and discovering data through pure rationality, which is possible thanks to our new technological breakthrough of Relational Intelligence.
Relational Intelligence (RI) is a new informational platform and network that implements a series of new algorithms, processors (or computers) and state of the art eeggis to produce a new type of machine intelligence which is specifically designed to process concepts, their retrieval and/or their rational discovery. With substantial differences from current Artificial Intelligence (AI), RI opens new horizons on information retrieval and processing. For example, once eeggi is taught or discovers that “Mary” is a girl, all the attributes relative to a female human become available and/or distinguishable, thus allowing eeggi to retrieve and respond to all sorts of questions about “Mary” -the girl-.
Why use eeggi?
Because it is immensely more efficient, robust, responsive and comprehensive than any text-based technology.
A text-based engine gives no attention to meanings; as a consequence, it promotes the following problems:
a) Limits findings to the text itself, crippling results and inventory (avoids other equally meaningful data, synonyms, etc.),
b) Allows for numerous irrelevant hits.
On the other hand, instead of just finding the text, eeggi focuses on the meaning behind the text, avoiding the problems above by retrieving all “equally meaningful” data, not crippling inventories, and reducing irrelevant hits. In addition, eeggi is the only engine in the world capable of grouping results by meaning (words with multiple meanings, such as: Right = correct; Right = turning). In short, eeggi saves users’ time and enhances sales.
A Text-based search engine (technology used to search the Internet) finds only the text, exactly as entered, ignoring the concept of the query, thus retrieving millions of irrelevant results while treating words such as “photo” and “photograph” as if they meant different things; and words such as “light” (radiation) and “light” (weight) as if they meant identical things. But eeggi implements Relational Intelligence for retrieving results based on concept and to respect the words’ proximities and relationships.
Furthermore, eeggi permits questioning, such as entering “Where did Napoleon die?” to obtain a single compiled response or… “St. Helena” (not thousands of results), providing superior, conceptually matching, lesser but more appropriate results, with either very little or no irrelevance whatsoever.
With eeggi, searches can:
* Include synonyms (other words with the same meaning),
* Manipulate similarities (words such as “pretty” and “gorgeous;” yet respecting their conceptual intensities),
* Automatically organize results based on the word’s concept (text with several meanings),
* Reduce irrelevance (allowing very specific and detailed queries)
* Become multi-lingual (handle several languages simultaneously)
* Find conclusive and/or deductive results (other information native to deductive intelligence)
* Respect Directional Conceptuality (avoids inverted phrases and sentences)
* Utilize search controls (user can manipulate search magnitude and behavior)
* Respond to questions
Will eeggi search the Internet?
Yes, eeggi was designed to replace text-based technology, and to surpass language barriers effortlessly. Providing more sensible, organized, comprehensive, and conceptually meaningful results, eeggi is the optimum Internet Search engine.
I encourage you to visit the Demo for a trial of eeggi versus text.
To see an example of how an Internet Search engine is limited by its own text please click here.
