I’m back!
Earlier I described the OpenCalais web service.
The ecosystem of web services
The NASA Earth Observatory Glossary defines an ecosystem as “any natural unit or entity including living and non-living parts that interact to produce a stable system through cyclic exchange of materials” [NASA]. The concept can be applied to Internet-based applications that function as information-consuming or information-producing “organisms” and that interact with each other interdependently by exchanging information.
The IBM web site, for its part, defines “web services” as “self-contained, modular, distributed, dynamic applications that can be described, published, located, or invoked over the network to create products, processes, and supply chains.”
As discrete, possibly autonomous “organisms” in an Internet-based information ecosystem, web services-enabled applications expose data and/or service endpoints in multiple ways, including Really Simple Syndication (RSS) feeds and web service Application Programming Interfaces (APIs) that use Simple Object Access Protocol (SOAP), XML Remote Procedure Call (XML-RPC) or REpresentational State Transfer (REST). XML is still widely used to embed data in responses to data or process requests, but an increasing number of web service applications also provide responses in JavaScript Object Notation (JSON). OpenCalais and Alchemy respond to API requests using the Resource Description Framework (RDF), a semantic web data model, commonly serialized as XML, that structures data as triples (subject, predicate, object). Both also perform named entity disambiguation by linking to external knowledge bases (e.g., CIA Factbook, Wikipedia, Freebase). These web service applications may even provide machine learning-based services such as natural language processing (specifically named entity extraction and concept annotation), language detection and translation, and text classification. Tools that enable semantic processing of content (not just classification) can potentially expose the richer knowledge embedded in unstructured data, such as news about outbreaks and disasters.
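To make the triple idea a bit more concrete, here is a minimal sketch in plain Python. It is not the OpenCalais or Alchemy response format: the `ex:` names, the example.org link, and the helper functions `match` and `entities_of_type` are all made up for illustration. It only shows how (subject, predicate, object) statements about a news item could be stored and queried.

```python
from typing import List, Optional, Tuple

# A triple is simply (subject, predicate, object).
Triple = Tuple[str, str, str]

# Hypothetical triples, loosely modelled on what an entity-extraction
# service might say about a news item on an outbreak. None of these
# identifiers come from a real service.
TRIPLES: List[Triple] = [
    ("ex:doc1", "ex:mentions", "ex:Cholera"),
    ("ex:doc1", "ex:mentions", "ex:Haiti"),
    ("ex:Cholera", "rdf:type", "ex:MedicalCondition"),
    ("ex:Haiti", "rdf:type", "ex:Country"),
    # Disambiguation: link the extracted entity to an external knowledge base.
    ("ex:Haiti", "owl:sameAs", "http://example.org/kb/Haiti"),
]


def match(
    triples: List[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> List[Triple]:
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


def entities_of_type(triples: List[Triple], doc: str, type_name: str) -> List[str]:
    """List the entities a document mentions that have the given type."""
    mentioned = {o for (_, _, o) in match(triples, subject=doc, predicate="ex:mentions")}
    return sorted(
        e for e in mentioned
        if match(triples, subject=e, predicate="rdf:type", obj=type_name)
    )


if __name__ == "__main__":
    print(entities_of_type(TRIPLES, "ex:doc1", "ex:Country"))
    print(match(TRIPLES, subject="ex:Haiti", predicate="owl:sameAs"))
```

The point of the pattern-matching helper is that a consumer does not need to know the document's layout in advance: it asks questions of the statements ("which countries does this item mention?") rather than walking a fixed XML or JSON structure.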