Semantic Web Scalers
Expose whatever there is as RDF; the next guy will unify terms and build search and apps
Data Warehouse Keepers
Data is spread out, with implicit semantics, complex schemas, heterogeneous sources and ambiguous terms, but we must make it join and aggregate cleanly
SPARQL-to-SQL mapping exists, but complex integrations are still built as data warehouses
We'd really like to map, but...
Can it be otherwise?
Pros
Even query performance across all data
Possibility of forward-chaining inference
Some SPARQL features may be better supported, e.g. unspecified predicates
Cons
Keeping data up-to-date
Complex setup, needs dedicated servers: you don't build them on a whim
No copying, no timeliness issues
RDBMS outperforms RDF for analytics workloads
Agile reconfiguration without reloading data
Mapping of SPARQL to SQL against any existing schema - whether stored in Virtuoso or elsewhere
Physical quad store
Federated/local RDBMS
Tackle any SQL analytics workload in SPARQL without extra cost
Deal with arbitrary SQL schema
Produce single SQL statements, optimizable by target RDBMS
Have intelligence for cases where one RDF entity can come from many relational sources
Bring similar but heterogeneous schemas into a unified ontology - Union View
Translate FKs of one schema to PKs in another - Distributed Join
Hide differences in normalization - Views for hiding joins
- Unit/Terminology conversions
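The union-view and unit-conversion items above can be illustrated with a minimal sketch. It uses Python's built-in `sqlite3` as a stand-in for the target RDBMS; the tables `eu_parts` and `us_items`, their columns, and the view `unified_part` are hypothetical and not taken from the slides.

```python
import sqlite3

# In-memory database standing in for two similar but heterogeneous schemas.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE eu_parts (part_no INTEGER PRIMARY KEY, label TEXT, weight_kg REAL);
CREATE TABLE us_items (item_id INTEGER PRIMARY KEY, item_name TEXT, weight_lb REAL);
INSERT INTO eu_parts VALUES (1, 'bolt', 0.5);
INSERT INTO us_items VALUES (1, 'nut', 2.0);

-- Union view: one logical "part" with unified column names.
-- The 'source' column keeps keys from the two schemas apart,
-- and pounds are converted to kilograms in the second branch.
CREATE VIEW unified_part AS
  SELECT 'eu' AS source, part_no AS part_key, label AS name, weight_kg
    FROM eu_parts
  UNION ALL
  SELECT 'us', item_id, item_name, weight_lb * 0.45359237
    FROM us_items;
""")

rows = conn.execute(
    "SELECT source, part_key, name, weight_kg FROM unified_part ORDER BY source"
).fetchall()
for row in rows:
    print(row)
```

A mapping layer could then treat `unified_part` as a single table when assigning a class and properties, while the underlying schemas stay untouched.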
Define URI formats and their subclass relations
Define which key-column-value combinations make a triple
Arbitrary SQL is allowed for mapping values and filtering
A single RDF node can be a composite of many columns, e.g. multipart key
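As a rough sketch of the points above, the following Python shows how URI formats plus a choice of key and value columns turn one relational row into triples, including a subject built from a multipart key. The URI patterns, namespace, and column names are assumptions made for illustration; this is not the mapping language used in the talk.

```python
from typing import Dict, Iterator, Tuple

Triple = Tuple[str, str, str]

# Hypothetical URI formats, one per entity type.
ORDER_LINE_URI = "http://example.com/orderline/{order_id}/{line_no}"
PRODUCT_URI = "http://example.com/product/{product_id}"
NS = "http://example.com/schema#"


def row_to_triples(row: Dict[str, int]) -> Iterator[Triple]:
    """Yield the triples that one order-line row stands for."""
    # The subject is a composite of two key columns (multipart key).
    subject = ORDER_LINE_URI.format(
        order_id=row["order_id"], line_no=row["line_no"]
    )
    # key + column + value -> (subject, predicate, literal object)
    yield (subject, NS + "quantity", str(row["quantity"]))
    # A foreign key column becomes a URI in the referenced entity's format.
    yield (subject, NS + "product",
           PRODUCT_URI.format(product_id=row["product_id"]))


row = {"order_id": 17, "line_no": 2, "product_id": 501, "quantity": 3}
triples = list(row_to_triples(row))
for t in triples:
    print(t)
```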
The 22 TPC-H queries expressed as extended SPARQL
Each generates a single SQL statement, executable by Virtuoso, Oracle and others
Next, make several TPC-H databases on different servers and run the queries against the union
In OpenLink Data Spaces, 6 Collaborative apps all mapped to SIOC:
Trivially becomes a union of everything, 1000+ lines of SQL
Intelligently (once per app) becomes a union of:
Mapping for integration is not trivial
Be careful when mapping multiple tables/columns to one class/property
Make URI schemes which encode type and source, so that senseless joins are not attempted when types are not specified in the query
Understand what the mapping logic can and cannot optimize
Understand what SQL can and cannot optimize
View resulting SQL for sanity check
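The advice about URI schemes that encode type and source can be sketched as follows: if each mapped table has its own URI prefix, a query with a known subject URI only needs the branch whose prefix matches, and the other tables never enter the generated SQL. The prefixes, table names, and helper functions here are hypothetical and only illustrate the idea.

```python
from typing import List, NamedTuple


class Branch(NamedTuple):
    uri_prefix: str   # encodes source application and entity type
    table: str        # relational table behind this branch


# Hypothetical branches of a union mapping to one class.
BRANCHES = [
    Branch("http://example.com/blog/post/", "blog_post"),
    Branch("http://example.com/wiki/page/", "wiki_page"),
    Branch("http://example.com/forum/message/", "forum_msg"),
]


def branches_for_subject(subject_uri: str) -> List[Branch]:
    """Keep only the branches whose URI scheme can produce this subject."""
    return [b for b in BRANCHES if subject_uri.startswith(b.uri_prefix)]


def key_from_uri(branch: Branch, subject_uri: str) -> int:
    """Recover the integer key by stripping the branch prefix."""
    return int(subject_uri[len(branch.uri_prefix):])


subject = "http://example.com/wiki/page/42"
matched = branches_for_subject(subject)
statements = [
    (f"SELECT id, title FROM {b.table} WHERE id = ?", key_from_uri(b, subject))
    for b in matched
]
print(statements)
```

With a less distinctive scheme, for example one shared prefix for all three tables, every branch would match and the result would be a three-way union, which is the kind of senseless work the slide warns about.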
Mapping must work against any RDBMS/Schema, as is
But there is Virtuoso SQL between the mapping and target RDBMS(s)
Location- and latency-conscious distributed cost model
Breakup for making a wide result set into a row per property
Inverse functions
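The breakup item above can be pictured as turning one wide row into one row per property. The sketch below does this in plain Python, under the assumption that NULL columns produce no row; the function name `breakup` and the sample columns are illustrative and do not show Virtuoso's actual SQL syntax.

```python
from typing import Dict, Iterator, Tuple


def breakup(row: Dict[str, object],
            key_column: str) -> Iterator[Tuple[object, str, object]]:
    """Turn one wide row into (key, property, value) rows."""
    key = row[key_column]
    for column, value in row.items():
        # The key itself is not a property; NULLs yield no row (assumption).
        if column == key_column or value is None:
            continue
        yield (key, column, value)


wide_row = {"id": 7, "name": "Alice", "email": None, "city": "Oslo"}
narrow_rows = list(breakup(wide_row, "id"))
for r in narrow_rows:
    print(r)
```

Doing this in one pass over the wide result avoids scanning the source table once per property.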
OpenLink Data Spaces - Blog, Wiki, News, Social Network, Feed Aggregation, Tag Clouds, Bookmarks etc.
OpenLink's own MIS - "total information awareness": URI for any CRM Object, Account, Product, Support Case, Email etc.
Musicbrainz
phpBB, Drupal, MediaWiki, WordPress, Bugzilla, and others.