semantic web

Web 2.0 executive summary

Thursday, April 2nd, 2009 | semantic web, web | 2 Comments

Wordle, a neat applet from IBM, generated the following output from this blog’s RSS feed.

Update: Regenerated to address stattos’s concerns about the presence of banks at the centre of the old one. I think this one might make a nice t-shirt image.

atlanticlinux-wordle


Linux and the Semantic Web

Saturday, March 28th, 2009 | galway, hardware, linux, semantic web | No Comments

I’ve recently (well, back in January, but it took me a while to blog about it) started working with the DERI Data Intensive Infrastructure group (DI2). The Digital Enterprise Research Institute (DERI) is a Centre for Science, Engineering and Technology (CSET) established in 2003 with funding from Science Foundation Ireland. Its mission is to Make the Semantic Web Real – in essence, DERI is both working on the theoretical underpinnings of the Semantic Web and developing tools and technologies which will allow end-users to utilise the Semantic Web.

The group I’m working with, DI2, has a number of interesting projects, including Sindice, which aims to be a search engine for the Semantic Web, and a forthcoming project called Webstar, which aims to crawl and store most of the current web as structured data. Webstar will allow web researchers to perform large-scale data experiments on this store of data, so they can focus on their goals rather than spending huge resources crawling the web and maintaining large data storage infrastructures.

Sindice and Webstar both run on commodity hardware running Linux. We’re using technologies such as Apache Hadoop and Apache HBase to store these huge datasets distributed across a large number of systems. We are initially working with a cluster of about 40 computers but expect to grow to a larger number over time.

My role in DI2 is primarily the care of this Linux infrastructure – some of the problems that we need to deal with include how to quickly install (and re-install) a cluster of 40 Linux systems, how to efficiently monitor and manage these 40 systems and how to optimise the systems for performance. We’ll use a lot of the same technologies that are used in Beowulf-style clusters, but we’re looking more at distributed storage than at parallel processing, so there are differences. I’ll talk a little about our approach to mass-installing the cluster in my next post.


Semantic Web enabled Blog

Wednesday, December 6th, 2006 | semantic web, web, xml | No Comments

I was at a presentation recently from the Digital Enterprise Research Institute (DERI) on some of their current work. We do a lot of work with Semantic Web technologies with our partner Profium. Profium’s products use Semantic Web technologies in certain niches such as the news and media industries where the benefits of Semantic Web in managing large amounts of metadata bring clear business advantages.

Outside of such niches, I’ve found it difficult to see where or how Semantic Web technology would be adopted by the mainstream. It was great to see that the folks at DERI have been busy working on just such applications. One of their current projects is the Semantically-Interlinked Online Communities (SIOC) Project, which is developing tools that will ultimately allow the islands of information in blogs, forums and mailing lists to be accessed in whatever way a person wishes, rather than requiring a person to access each source of information individually. The SIOC project will also make it easier to link information in each of these different media, or indeed to mine the information stored in various locations and create your own virtual medium with a user interface of your own creation. I think the area of community software such as forums, blogs and mailing lists is eminently suitable for Semantic Web technologies – there are massive amounts of information in such islands around the Internet. Unfortunately, at the moment it is very difficult to access this information and separate the signal from the noise.

To do my bit for the nascent semantic web I’ve installed SIOC Exporter for WordPress on this blog. This plugin allows any blog using WordPress to export SIOC metadata about the blog. Wahey, Applepie Solutions is on the Semantic Web!

For other bloggers and system administrators who are interested in this, it is a very straightforward WordPress plugin to install – just follow the INSTALL document that comes with the plugin files.

The DERI folks also had a poster session where they demonstrated other practical applications, including the Semantic Radar extension for Firefox. This nifty extension scans each page you open in your browser for Semantic Web metadata (RDF) and flags the presence of such data on a page with a little icon in the status bar. At the moment it only handles a limited number of types of metadata (including SIOC, FOAF and DOAP) but over time this should expand. It can also ping the Semantic Web Ping Service, allowing others to learn about your metadata (and the pages it describes).
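For the curious, here is a rough Python sketch of the kind of check a tool like Semantic Radar might perform. It is not the extension's actual code. It assumes that a page advertises its RDF metadata through `<link rel="meta" type="application/rdf+xml">` elements (a common autodiscovery convention), and the class name, function name, sample page and example.org URLs are all made up for illustration.

```python
from html.parser import HTMLParser


class RdfLinkFinder(HTMLParser):
    """Collect <link> elements that advertise RDF metadata for a page.

    Assumption: metadata is advertised with rel="meta" and
    type="application/rdf+xml"; other conventions are ignored here.
    """

    def __init__(self):
        super().__init__()
        self.found = []  # list of (title, href) tuples

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <link ... />.
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        mime_type = (attributes.get("type") or "").lower()
        if "meta" in rel_values and mime_type == "application/rdf+xml":
            self.found.append((attributes.get("title"), attributes.get("href")))


def find_rdf_links(html):
    """Return (title, href) pairs for RDF metadata links found in html."""
    finder = RdfLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.found


# Hypothetical page used only to exercise the sketch.
SAMPLE_PAGE = """
<html><head>
<title>Example blog</title>
<link rel="stylesheet" type="text/css" href="style.css" />
<link rel="meta" type="application/rdf+xml" title="SIOC" href="http://example.org/sioc.rdf" />
<link rel="meta" type="application/rdf+xml" title="FOAF" href="http://example.org/foaf.rdf" />
</head><body><p>Hello</p></body></html>
"""

if __name__ == "__main__":
    for title, href in find_rdf_links(SAMPLE_PAGE):
        print(title, href)
```

A real tool would then fetch each linked document and work out which vocabulary it uses; this sketch simply trusts the `title` attribute.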

It’s good to finally see some mainstream developments in the Semantic Web world. Hopefully this is only the beginning.