Semantic Web enabled Blog

Wednesday, December 6th, 2006 | semantic web, web, xml | No Comments

I was at a presentation recently from the Digital Enterprise Research Institute (DERI) on some of their current work. We do a lot of work with Semantic Web technologies alongside our partner Profium. Profium’s products use Semantic Web technologies in certain niches, such as the news and media industries, where the Semantic Web’s strengths in managing large amounts of metadata bring clear business advantages.

Outside of such niches, I’ve found it difficult to see where or how Semantic Web technology would be adopted by the mainstream. It was great to see that the folks at DERI have been busy working on just such applications. One of their current projects is the Semantically-Interlinked Online Communities (SIOC) project, which is developing tools that will ultimately allow the islands of information in blogs, forums and mailing lists to be accessed in whatever way a person wishes, rather than requiring them to visit each source of information individually. The SIOC project will also make it easier to link information across these different media, or indeed to mine the information stored in various locations and create your own virtual medium with a user interface of your own creation. I think community software such as forums, blogs and mailing lists is eminently suitable for Semantic Web technologies – there are massive amounts of information in such islands around the Internet, but at the moment it is very difficult to access this information and separate the signal from the noise.

To do my bit for the nascent Semantic Web, I’ve installed the SIOC Exporter for WordPress on this blog. This plugin allows any blog running WordPress to export SIOC metadata about the blog. Wahey, Applepie Solutions is on the Semantic Web!

For other bloggers and system administrators who are interested in this, it is a very straightforward WordPress plugin to install – just follow the INSTALL document that comes with the plugin files.
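
If you haven’t installed a WordPress plugin from the command line before, the process is roughly the following – the paths and archive name here are assumptions for illustration, and the plugin’s own INSTALL document is the authoritative reference:

    # Copy the plugin into the WordPress plugins directory
    # (the blog root and archive name are just examples)
    $ cd /var/www/blog/wp-content/plugins
    $ unzip ~/sioc-wordpress-plugin.zip

    # Then activate it from the WordPress admin interface (Plugins page)
    # and verify the exported metadata as described in the plugin's docs.
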
The DERI folks also had a poster session where they demonstrated other practical applications, including the Semantic Radar extension for Firefox. This nifty Firefox extension scans each page you open in your browser for Semantic Web metadata (RDF) and flags the presence of such data with a little icon in the status bar. At the moment it only handles a limited number of metadata types (including SIOC, FOAF and DOAP) but this should expand over time. It can also ping the Semantic Web Ping Service, allowing others to learn about your metadata (and the pages it describes).

It’s good to finally see some mainstream developments in the Semantic Web world … hopefully this is only the beginning.

Debian GNU/Linux on a HP NC6400

Monday, September 18th, 2006 | hardware, linux | 17 Comments

Introduction

We purchased a new HP NC6400 recently and I thought I’d document my experiences for reference by anyone else planning on installing a Linux distribution on an NC6400. Overall, the installation was pretty painless and the notebook is fully usable after a few hours of install work.

The notebook will be used for development and general office work and needs to be reasonably portable – we regularly travel to our customers around Ireland and in places like Finland and France, so durability and battery life are issues for us. We’ve previously bought hardware from Dell and HP, but I was reluctant to purchase another Dell notebook after the problems we had with the backlight on a Dell Latitude D600 (which seems to be an ACPI problem that others have worked around in various ways). I guess in this day and age I expect to be able to use my hardware without problems on Linux and, to date, none of Dell’s BIOS upgrades properly address this problem (and it seems like this isn’t the only model with ACPI problems), so I’m inclined to take my business to another vendor. We previously purchased an NC6000 from HP and it has worked flawlessly with both Windows XP and Linux, so I decided to go with them again given their track record (for the record, I used to work for HP so I may be biased).

This NC6400 (you gotta love those URLs, memorable to say the least) seems to meet our requirements – it is pretty powerful, using the new Intel Core Duo; battery time is supposed to be around 4.5 hours (the Core Duo has a good reputation for power consumption); and it seems to be reasonably light while providing a good hard drive and all the bells and whistles I’d expect on a modern laptop (including 802.11a/b/g wireless, FireWire and USB 2.0 connections and so on). My only reservations were about the widescreen and the embedded graphics card. Widescreen may be useful for a system used primarily for watching movies, but it makes for a bulkier notebook which may be a problem both for carrying and for using in cramped locations like airplanes (business class is too extravagant for us I’m afraid). Integrated graphics cards like the Intel one borrow memory from main system memory, which can have a performance impact – Intel also don’t have a great reputation for general 3D graphics. However, given that purchasing notebooks is inevitably about trade-offs, the NC6400 seemed, on balance, to meet our requirements so I went ahead and ordered it.

The system arrived the following day (we use Passax in Galway and I’m always pleasantly surprised at how fast they deliver new systems to us) and was exactly as expected, except for the graphics. Despite the specification on the HP website listing it as having an Intel integrated card, the system actually carries an ATI Radeon Mobility X1300. A pleasant surprise – the X1300 should generally perform far better. So, in summary, the system has the following spec,

  • 2GHz Intel Core Duo Processor
  • 1GB Physical Memory
  • 5400rpm 80GB hard-drive
  • ATI Radeon Mobility X1300 graphics
  • Dual layer DVD burner
  • Intel a/b/g wireless card
  • Gigabit ethernet

My concerns about the widescreen making the notebook too bulky were mostly unfounded – overall, the NC6400 is no larger than the 15″ NC6000 we already have in the office. Generally, the build quality of the NC6400 seems pretty good and I’d expect it to travel as well as its older brother, the NC6000.

Installation

I downloaded the beta 3 netinst installer for Debian Etch (the next release of Debian, tentatively slated for final release in December of this year). Before I could install Debian on the notebook, I had to make a partition available. In the past, I’ve repartitioned systems and then reinstalled Windows and Linux on the system – this is pretty time consuming and I’ve been trying to avoid it where possible. I’ve had good results with Partition Magic in the past but don’t have a copy in the office, so I tried the latest version (0.3.1) of the GParted LiveCD – an open source tool for resizing partitions. I’ve seen problems with GParted, but discovered that these generally stem from GParted being pretty cautious about resizing NTFS partitions if it sees any errors on them. It seems even factory-installed Windows partitions often have non-fatal errors on them. Windows doesn’t seem to worry about them normally (even running the disk checking tool doesn’t flag or fix them) – you need to run chkdsk /f from the command prompt in order to get Windows to fix them. Once you run that and reboot twice (hey, it’s Windows, stop sniggering back there), GParted seems to work fine.
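
GParted itself is driven through its GUI, but it uses ntfsresize under the hood for NTFS partitions – if you want to see what a resize would do before committing to it, ntfsresize supports a dry run. A rough sketch, with the device name and target size as placeholders:

    # Show current NTFS usage and the minimum size it could shrink to
    $ ntfsresize --info /dev/sda1

    # Simulate shrinking the filesystem to 40GB without touching the disk
    # (drop --no-action to do it for real)
    $ ntfsresize --no-action --size 40G /dev/sda1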

I booted the Etch install CD and, using the following options, installed the basic system,

  • I opted for an expertgui install at the installer boot prompt rather than a standard install, in order to have maximum control over the process if something went wrong during the install.
  • For partitioning, I opted for 2 logical partitions, a 2GB swap partition and a 20GB root partition. I normally like to split up the filesystem across multiple partitions but in the case of notebooks, it’s hard to anticipate how the user will want to use it … so I’m inclined to lump everything into the one filesystem (I don’t recommend that approach for production servers).
  • When selecting a kernel, the installer didn’t offer an SMP option, so I went with linux-image-2.6-686, making a mental note to install the SMP version afterwards to utilise both processor cores.
  • I chose an Irish mirror to install from and opted for the standard system – I’d prefer to manually select the packages I want after the initial install rather than suck in a load of packages I don’t need (or use a standard Applepie Solutions package list I’ve already prepared and dpkg --set-selections – see the sketch after this list).
  • Installation proceeded smoothly until the boot-loader section. This failed with a file-not-found error when trying to install grub. Some digging around turned up Debian bug 380351. As a workaround, I logged onto the console and ran the following,
    • chroot /target
    • /usr/sbin/grub-install --recheck "(hd0)"
  • As an alternative, if you want the installer to run smoothly without errors, you can do the following before the boot-loader step,
    • chroot /target
    • ln -s /usr/sbin/grub-install /sbin/grub-install
  • At this point, the base system was installed and we were ready to reboot into the basic system.
  • After the grub fix, the bootloader was installed and configured ok – with both Linux and the Windows XP install available as boot options.
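
For the standard package list approach mentioned above, the usual dpkg selections round-trip looks something like this (the file name is just an example):

    # On a reference machine, save the current package selections
    $ dpkg --get-selections > applepie-packages.txt

    # On the freshly installed system, load the list and let apt
    # install everything in one pass
    $ dpkg --set-selections < applepie-packages.txt
    $ apt-get dselect-upgrade
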
Conclusion

After logging into the system for the first time, I went and installed the SMP version of the kernel. A reboot into this kernel confirmed that both processor cores are available. Running acpi -V shows some thermal information, but nothing about fan speeds or which sensor is which. I tried installing lm-sensors but it doesn’t seem to find any usable sensors. This is unfortunate and I’ll need to investigate further, but the fan seems to run ok and temperatures seem ok on the system, so there is no danger of frying both processors due to overheating anyways 🙂
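
For reference, the post-install commands boil down to roughly the following (the SMP kernel package name is as I remember it appearing in Etch and may differ slightly):

    # Install the SMP flavour of the kernel, then reboot into it
    $ apt-get install linux-image-2.6-686-smp

    # Confirm both cores are visible after the reboot
    $ grep -c ^processor /proc/cpuinfo

    # Thermal information and sensor probing
    $ acpi -V
    $ apt-get install lm-sensors
    $ sensors-detect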

After that, it was just a case of installing a usable graphical environment including GNOME, OpenOffice and Eclipse (which has recently been added to Debian). This went very smoothly, although the ati driver did not work with the X1300. I tried the vesa driver; it starts up but doesn’t allow the display to operate at the native 1440×900 resolution of the notebook’s LCD. Given my public stance on proprietary drivers, it was a short trip to ATI’s site to download their binary Linux drivers. The installation went smoothly and the system came back up at the panel’s native 1440×900.
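
The driver install itself is unremarkable – roughly the steps below, with the installer file name varying depending on the version currently on ATI’s site:

    # Run ATI's binary installer (exact file name depends on the version)
    $ sh ./ati-driver-installer-x.xx.x.run

    # Let aticonfig write the fglrx configuration into /etc/X11/xorg.conf,
    # then restart X
    $ aticonfig --initial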

The wired ethernet works fine. I haven’t tested the wireless yet, but since Intel make good open source drivers available for their chipsets (including the 2100, 2200BG, 2915ABG and 3945ABG) I’m confident we should have no problems getting it up and running (I will post a follow-up if there are any issues).
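
A quick way to confirm exactly which Intel wireless chipset is fitted before hunting down the right driver (assuming pciutils is installed):

    # The wireless card shows up here with its chipset name
    $ lspci | grep -i -E 'network|ethernet|wireless'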

Queries and comments welcome – especially from other NC6400 users running Linux. This page also has some notes on running Linux on an NC6400.

Linux RAID solutions (Part II)

Monday, September 18th, 2006 | hardware, linux | No Comments

I finished my last article on Linux RAID solutions with a discussion of the hard drive technology we’d be investigating. Once we have the system up and running and reliably storing data, we’re going to have to look into backing it all up. The traditional approach to backups has been to regularly dump your data to tape. This worked pretty well until hard-drive capacities started significantly overtaking tape-drive capacities.

Linear Tape-Open (LTO) is one of the most common high-capacity tape formats available these days. Its raw capacity runs from 100GB (LTO-1) up to 400GB (LTO-3). Tape drive manufacturers like to prominently quote capacity in terms of compressed data rather than raw data, which is slightly misleading since it depends on the compressibility of your data. Text files will compress very well – you can normally expect them to halve in size as the LTO drive compresses them. Binary data files, on the other hand, may not compress at all. Binary data is what our customer will be generating, and the applications generating it may very well be applying compression on the fly already … in which case we’ll need something closer to the raw storage capacity than the compressed capacity. Something to be aware of.
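
As a back-of-the-envelope illustration (the 2TB figure is invented, not our customer’s actual volume): if the data is already compressed, you size against the raw capacity, which roughly doubles the number of tapes compared to the quoted compressed figures.

    # Rough tape counts for 2TB of incompressible data
    $ echo "scale=1; 2000/400" | bc    # LTO-3, 400GB raw -> 5.0 tapes
    $ echo "scale=1; 2000/200" | bc    # LTO-2, 200GB raw -> 10.0 tapes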

Nowadays, if you’re using SAN technology, you can normally take a snapshot of your data and copy it onto another part of your SAN array. Assuming you have enough storage, this is a good, quick way of taking a reliable copy of your data. Of course, you may want to copy it onto another SAN at a remote location if you want a disaster-tolerant solution. A lot of commercial providers are starting to show up in this space; in Ireland we have companies like Central DataBank and Hosting365 (who wonder why anyone would use tape backup anymore). I guess there are pros and cons, and I’d like to see a detailed cost-benefit analysis before I could definitively say that tape backup never makes sense – at least in countries like Ireland, where broadband is still relatively expensive and the providers like to keep the Asymmetric in ADSL, it may not be economically viable to upload large amounts of data to a 3rd-party backup provider.
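
The same snapshot-then-copy idea works on a plain Linux storage box as well – this isn’t the SAN vendors’ mechanism, just the generic LVM equivalent, with the volume group and sizes invented for illustration:

    # Take a point-in-time snapshot of the data volume
    $ lvcreate --snapshot --size 10G --name data-snap /dev/vg0/data

    # Copy the frozen view to another machine, then drop the snapshot
    $ mount -o ro /dev/vg0/data-snap /mnt/snap
    $ rsync -a /mnt/snap/ backup-host:/backups/data/
    $ umount /mnt/snap
    $ lvremove -f /dev/vg0/data-snap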

If you happen to have an organisation with multiple offices and the bandwidth between them, making backups to offsite servers makes absolute sense – maybe instead of tape backup. If you don’t have that option, and especially if you are generating a large amount of data on a regular basis, you may need to look at a technology such as LTO-3. We’re still trying to decide whether LTO-2 will meet our needs or whether we’ll need to look at LTO-3. We have to do another round of sizing estimates with our customer and decide on a backup schedule – if we’re doing weekly backups we may be ok with LTO-2; if it’s monthly, LTO-3 may be necessary. For business-critical data, nightly or weekly backups are vital, but in our case monthly backups may be sufficient, especially if we have good availability and reliability from the underlying storage array.

Once we’ve selected a drive from one of the main vendors (all the big names supply LTO hardware, so we’ll probably go with the vendor the customer already has a relationship with – the technology is largely similar anyway), we’ll need to look at which backup software to use. For small setups, it’s still common enough to see the tar command (or cpio) being used. For larger, more complex configurations, the two most common open source packages are Bacula and Amanda. Both packages have a good reputation, though both have some limitations. Our current plan is to evaluate both during the initial installation of the storage system. If neither solution meets our customer’s needs, we may also investigate Arkeia’s SmartBackup, which looks like it will also meet our requirements.
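
For completeness, the small-setup tar approach mentioned above is about as simple as backups get (assuming the tape drive appears as /dev/st0, the usual name for the first SCSI tape device):

    # Write a full backup of /data to the tape
    $ tar -cvf /dev/st0 /data

    # Rewind and read the archive back to verify it
    $ mt -f /dev/st0 rewind
    $ tar -tvf /dev/st0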

We’re planning on deploying this system in the next few months – I’ll report back on how it goes some time after we’ve finished testing.