Wednesday, May 28, 2014

Maria-Esther Vidal and Elena Simperl join JWS editorial board

The Journal of Web Semantics welcomes Drs. Maria-Esther Vidal and Elena Simperl as new members of its editorial board.

Dr. Maria-Esther Vidal is a full professor in the Computer Science Department of the Universidad Simón Bolívar (Caracas, VE), where she also serves as the Assistant Dean for Research and Development in Applied Science and Engineering. Dr. Vidal leads the USB Semantic Web Group and has research interests that include publishing and consuming (linked) open data, query rewriting, optimization and execution, ranking, link prediction, and benchmarking and evaluation. She received a Ph.D. in Computer Science from USB in 2000.


Dr. Elena Simperl is a senior lecturer in the Web and Internet Science Research Group at the University of Southampton (Southampton, UK). Her research lies at the intersection of semantic technologies and social computing. Her interests include the socially and economically motivated aspects of creating and using semantically enabled Web content, and how such content can be used to foster collaboration and participation. She received a Ph.D. in Computer Science from the Freie Universität Berlin in 2006.

Monday, May 19, 2014

Preprints of papers from the 2012 Semantic Web Challenge


Preprints from the JWS special issue on the 2012 Semantic Web Challenge are available on the preprint server.

Thursday, May 15, 2014

JWS preprint: Querying NeXtProt Nanopublications and their Value for Insights on Sequence Variants and Tissue Expression

New paper on the Journal of Web Semantics preprint server:

Christine Chichester, Pascale Gaudet, Oliver Karch, Paul Groth, Lydie Lane, Amos Bairoch, Barend Mons, Antonis Loizou, Querying NeXtProt Nanopublications and their Value for Insights on Sequence Variants and Tissue Expression, Web Semantics: Science, Services and Agents on the World Wide Web, to appear, 2014.

Abstract: Understanding how genetic differences between individuals impact the regulation, expression, and ultimately function of proteins is an important step toward realizing the promise of personal medicine. There are several technical barriers hindering the transition of biological knowledge into the applications relevant to precision medicine. One important challenge for data integration is that new biological sequences (proteins, DNA) have multiple issues related to interoperability, potentially creating a quagmire in the published data, especially when different data sources do not appear to be in agreement. Thus, there is an urgent need for systems and methodologies to facilitate the integration of information in a uniform manner to allow seamless querying of multiple data types which can illuminate, for example, the relationships between protein modifications and causative genomic variants. Our work demonstrates for the first time how semantic technologies can be used to address these challenges using the nanopublication model applied to the neXtProt data set, a curated knowledgebase of information about human proteins. We have applied the nanopublication model to demonstrate querying over several named graphs, including the provenance information associated with the curated scientific assertions from neXtProt. We show by way of use cases using sequence variations, post-translational modifications (PTMs) and tissue expression that querying the neXtProt nanopublication implementation is a credible approach for expanding biological insight.

JWS preprint: Querying a Messy Web of Data with Avalanche, Basca and Bernstein


New paper on the Journal of Web Semantics preprint server:

Cosmin Basca and Abraham Bernstein, Querying a Messy Web of Data with Avalanche, Web Semantics: Science, Services and Agents on the World Wide Web, to appear, 2014.


Abstract: Recent efforts have enabled applications to query the entire Semantic Web. Such approaches are either based on a centralised store or link traversal and URI dereferencing as often used in the case of Linked Open Data. These approaches make additional assumptions about the structure and/or location of data on the Web and are likely to limit the diversity of resulting usages.

In this article we propose a technique called Avalanche, designed for querying the Semantic Web without making any prior assumptions about the data location or distribution, schema-alignment, pertinent statistics, data evolution, and accessibility of servers. Specifically, Avalanche finds up-to-date answers to queries over SPARQL endpoints. It first gets on-line statistical information about potential data sources and their data distribution. Then, it plans and executes the query in a concurrent and distributed manner trying to quickly provide first answers.

We empirically evaluate Avalanche using the realistic FedBench dataset over 26 servers and investigate its behaviour for varying degrees of instance-level distribution "messiness" using the LUBM synthetic dataset spread over 100 servers. Results show that Avalanche is robust and stable in spite of varying network latency, finding first results for 80% of the queries in under one second. It also exhibits stability for some classes of queries when instance-level distribution messiness increases. We also illustrate how Avalanche addresses the other sources of messiness (pertinent data statistics, data evolution and data presence) by design and show its robustness by removing endpoints during query execution.
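
For readers unfamiliar with the general idea, the two-phase pattern the abstract describes (gather statistics about candidate sources, then query them concurrently and return answers as they arrive) can be illustrated with a toy sketch. This is not Avalanche's actual algorithm or code. The endpoints here are hypothetical in-memory stand-ins rather than real SPARQL servers, and every name (`ENDPOINTS`, `count_matches`, `fetch`, `query`) is an assumption made for illustration only.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

# Hypothetical stand-ins for SPARQL endpoints:
# name -> (simulated latency in seconds, list of (subject, predicate, object) triples)
ENDPOINTS = {
    "ep_a": (0.05, [("alice", "knows", "bob")]),
    "ep_b": (0.01, [("carol", "knows", "dave"), ("carol", "likes", "tea")]),
    "ep_c": (0.02, [("erin", "likes", "jazz")]),
}


def count_matches(name, predicate):
    """Statistics phase: count the triples at one endpoint that use the predicate."""
    _, triples = ENDPOINTS[name]
    return sum(1 for _, p, _ in triples if p == predicate)


def fetch(name, predicate):
    """Execution phase: return (subject, object) pairs after a simulated delay."""
    latency, triples = ENDPOINTS[name]
    time.sleep(latency)
    return [(s, o) for s, p, o in triples if p == predicate]


def query(predicate):
    """Yield (endpoint, answers) pairs as soon as each relevant endpoint responds."""
    # Only endpoints whose statistics show matching data are queried at all.
    candidates = [n for n in ENDPOINTS if count_matches(n, predicate) > 0]
    if not candidates:
        return
    with ThreadPoolExecutor(max_workers=len(candidates)) as pool:
        futures = {pool.submit(fetch, n, predicate): n for n in candidates}
        # as_completed hands back results in arrival order, so fast
        # endpoints deliver first answers without waiting for slow ones.
        for future in as_completed(futures):
            yield futures[future], future.result()


if __name__ == "__main__":
    for endpoint, answers in query("knows"):
        print(endpoint, answers)
```

The sketch leaves out everything that makes the real problem hard, such as joins across endpoints, query planning, and endpoints disappearing mid-query. It only shows why streaming results in completion order helps time-to-first-answer.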