I’m back!

Earlier I described the OpenCalais web service.

The ecosystem of web services

The NASA Earth Observatory Glossary defines an ecosystem as “any natural unit or entity including living and non-living parts that interact to produce a stable system through cyclic exchange of materials” [NASA]. The concept can be applied to Internet-based applications that function as information-consuming or information-producing “organisms” and that interact with each other interdependently by exchanging information.

The IBM web site, on the other hand, defines “web services” as “self-contained, modular, distributed, dynamic applications that can be described, published, located, or invoked over the network to create products, processes, and supply chains.”

As discrete, possibly autonomous “organisms” in an Internet-based information ecosystem, web services-enabled applications expose data and service endpoints in multiple ways, including Really Simple Syndication (RSS) feeds and web service Application Programming Interfaces (APIs) built on Simple Object Access Protocol (SOAP), XML Remote Procedure Call (XML-RPC) or REpresentational State Transfer (REST). Besides embedding data in XML when responding to data or process requests, an increasing number of web service applications also provide responses in JavaScript Object Notation (JSON). OpenCalais and Alchemy respond to API requests using Resource Description Framework (RDF), a semantic web format, commonly serialized as XML, that structures data as triples (subject, predicate, object). Both also perform named entity disambiguation by linking to external knowledge bases (e.g., CIA Factbook, Wikipedia, Freebase). These web service applications may even provide machine learning-based services such as natural language processing (specifically named entity extraction and concept annotation), language detection and translation, and text classification. Tools that enable semantic processing of content (not just classification) could expose the richer knowledge embedded in unstructured data such as news about outbreaks and disasters.
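To make the triple structure concrete, here is a minimal sketch in Python. It does not reproduce the actual OpenCalais or Alchemy output; the `ex:` identifiers and predicate names are made up for illustration.

```python
import json
from typing import List, NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical identifiers, for illustration only (not real service output).
triples = [
    Triple("ex:article/1", "ex:mentionsPlace", "ex:place/Manila"),
    Triple("ex:place/Manila", "ex:sameAs", "ex:kb/Manila"),
    Triple("ex:article/1", "ex:mentionsDisease", "ex:disease/Dengue"),
]


def objects_of(data: List[Triple], subject: str, predicate: str) -> List[str]:
    """Return every object linked to `subject` by `predicate`."""
    return [t.obj for t in data if t.subject == subject and t.predicate == predicate]


def to_json(data: List[Triple]) -> str:
    """Serialize the same triples as a JSON array of objects."""
    return json.dumps([t._asdict() for t in data], indent=2)


print(objects_of(triples, "ex:article/1", "ex:mentionsPlace"))
print(to_json(triples))
```

The second triple shows the idea behind disambiguation by linking: an extracted entity is tied to an entry in an external knowledge base through one more statement.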

Examples:

  1. Structured data feeds: RSS, Keyhole Markup Language (KML), JSON, RDF triples
  2. Unstructured data transformation: Dapper, Yahoo Pipes
  3. Geocoding Services: Yahoo Maps, Google Maps, Geonames
  4. Named Entity Extraction: OpenCalais, Alchemy
  5. Concept Annotation: UMLS Knowledge Source Server
  6. Visualization: SIMILE Exhibit
  7. Text Classification: uClassify
  8. Language Detection and Translation: uClassify, Google Translate, Alchemy
  9. Text cleaning: Alchemy
  10. Entity Disambiguation Using Linked Data: OpenCalais, Alchemy

Major challenges to creating and leveraging a web services ecosystem

Updating multiple interfaces. Because these web service APIs are in rapid development, keeping track of upgraded APIs so that agent interfaces stay compatible and aligned is a challenge. Most web service providers do keep their users informed of upgrades or changes through listservs, blogs and social networking media (e.g., Facebook or Twitter). Even so, the design and development of self-configuring, self-healing agent interfaces that adapt to changes in web service APIs becomes important.
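One simple way to approach an interface that tolerates API changes is to keep a parser per known response layout and try each in turn. The sketch below assumes two hypothetical response layouts for an imaginary entity-extraction service; the field names are invented and do not belong to any real API.

```python
from typing import Any, Callable, Dict, List

Entity = Dict[str, str]


def parse_v1(payload: Dict[str, Any]) -> List[Entity]:
    """Hypothetical older layout: {"entities": [{"text": ..., "type": ...}]}."""
    return [{"name": e["text"], "type": e["type"]} for e in payload["entities"]]


def parse_v2(payload: Dict[str, Any]) -> List[Entity]:
    """Hypothetical newer layout: {"result": {"items": [{"label": ..., "category": ...}]}}."""
    return [{"name": e["label"], "type": e["category"]} for e in payload["result"]["items"]]


# Parsers are tried in order; supporting a new API version means adding one entry.
PARSERS: List[Callable[[Dict[str, Any]], List[Entity]]] = [parse_v1, parse_v2]


def normalize(payload: Dict[str, Any]) -> List[Entity]:
    """Return entities in one internal shape, whichever known layout arrived."""
    errors = []
    for parser in PARSERS:
        try:
            return parser(payload)
        except (KeyError, TypeError) as exc:
            errors.append(f"{parser.__name__}: {exc!r}")
    # Fail loudly so an unannounced API change is noticed instead of silently dropped.
    raise ValueError("No known parser matched the response: " + "; ".join(errors))


old = {"entities": [{"text": "Manila", "type": "City"}]}
new = {"result": {"items": [{"label": "Manila", "category": "City"}]}}
print(normalize(old) == normalize(new))
```

This is far from self-healing, but it keeps the rest of the agent isolated from layout changes and turns an unexpected upgrade into an explicit error.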

Sustainability of API provider. Although many web service API providers are companies that offer both commercial and free tiers, other companies have disappeared for lack of a sustainable model. The sustainability model affects the quality of the service being offered. For example, one company’s strategy for distributing data and information through the Short Message Service (SMS) is to send short advertisements together with the data. If an SMS message is 140 characters wide and that width is split in two, leaving 70 characters, one would be hard put to make use of the remaining space for message or information distribution.
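The cost of sharing the message width can be worked out with a little arithmetic. The sketch below uses the 140-character width and the 70-character split from the example above; the function name and its parameters are assumptions made for illustration.

```python
import math


def segments_needed(message: str, width: int = 140, ad_chars: int = 0) -> int:
    """Number of messages needed when `ad_chars` of each message go to an advertisement."""
    usable = width - ad_chars
    if usable <= 0:
        raise ValueError("the advertisement leaves no room for content")
    return math.ceil(len(message) / usable)


alert = "x" * 280  # stand-in for a 280-character alert
print(segments_needed(alert))               # full width available
print(segments_needed(alert, ad_chars=70))  # half of each message taken by an ad
```

With half the width taken by an advertisement, the same alert needs twice as many messages.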

User and developer community engagement. A few web service API development projects are in the open source domain. In these projects, the lack of engaged user and developer communities can hamper product improvements and service quality. In addition, without communities to shepherd the different but critical components of a web service project through their stages of maturity, some components may lag behind in development, leading to an incomplete product or service offering. A number of web service API providers, open source or not, have engaged their user and developer communities to collaborate and provide critical feedback and ideas for innovation through “crowd-sourcing.” Through this engagement, we see rapid improvements in the quality of web service APIs and more feature-rich service offerings. By engaging the user communities, web service providers are able to gather end-user requirements in real time and incorporate them within short iteration periods. By engaging the developer community, the project becomes more sustainable: even if the chief developer moves on, bustling developer and user communities can keep the project going.

Promising developments

When EpiSPIDER began in 2005, there were only a handful of web services available to connect to and outsource “business processes” to. Most of the “business process” gaps, like natural language processing and part-of-speech tagging, were filled by developing “in-house” code. The first sign that things would get better came when suddenly there were three mapping APIs to use (Google Maps, Yahoo Maps and Microsoft Virtual Earth), whereas the first EpiSPIDER prototype sported a Scalable Vector Graphics map interface.

Today, EpiSPIDER enjoys the availability and company of a plethora of web services it can leverage (those mentioned above). The use of data exchange standards (XML, RSS, RDF, JSON) has become commonplace in currently available APIs, and developers over time have made better sense of how these standards need to come together to enable “interoperability” among the different “organisms” of the web services-based information ecosystem. Web service APIs currently in existence and under development herald a new era in which previously esoteric processes have been so “commoditized” that one can think of them as veritable web services “building blocks” for creating information processing pipelines.
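The “building blocks” idea can be sketched as a chain of small stages, each standing in for a call to a separate web service. This is not how EpiSPIDER is implemented; every stage below is a local stub with made-up logic, and the coordinates are rough approximations.

```python
from functools import reduce
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]
Stage = Callable[[Record], Record]


def detect_language(record: Record) -> Record:
    """Stub for a language detection service; always answers English."""
    return {**record, "language": "en"}


def extract_places(record: Record) -> Record:
    """Stub for named entity extraction, using a tiny hard-coded place list."""
    known_places = {"Manila", "Jakarta"}
    words = record["text"].replace(",", " ").replace(".", " ").split()
    return {**record, "places": [w for w in words if w in known_places]}


def geocode(record: Record) -> Record:
    """Stub for a geocoding service; coordinates are approximate (lat, lon)."""
    gazetteer = {"Manila": (14.6, 121.0), "Jakarta": (-6.2, 106.8)}
    coords = {p: gazetteer[p] for p in record["places"] if p in gazetteer}
    return {**record, "coordinates": coords}


def run_pipeline(record: Record, stages: List[Stage]) -> Record:
    """Feed the record through each stage in order."""
    return reduce(lambda acc, stage: stage(acc), stages, record)


report = {"text": "Dengue cases reported in Manila."}
result = run_pipeline(report, [detect_language, extract_places, geocode])
print(result)
```

Because each stage takes a record and returns an enriched copy, swapping one provider for another only means replacing a single function in the list.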

And in the current climate where transparency, accountability and data sharing are encouraged, web services are there to make this climate even more valuable by enabling processing of information from distributed data repositories and recombining content from these repositories to come up with a new “whole that is greater than the sum of its parts.”
