Understanding the p2.index file

Lately the p2.index file (or lack thereof) has caused a serious problem for the Eclipse Foundation infrastructure. Denis highlighted the problems the file causes, but I thought it might help us architect a solution if we better understood why the file exists. Some people have suggested we just remove the file, or combine all the p2-related files into a single file (site.xml?). While this might plug the current leak, it will most likely lead to problems elsewhere. Like most things, this file was introduced to solve a problem. Of course, the road to hell is paved with good intentions.

To understand the p2.index file, it's important to understand p2. The Eclipse provisioning platform (p2) is not simply an update mechanism for Eclipse. It is a platform which can be instantiated and extended to construct provisioning solutions. The most common solution is the update mechanism we see in Eclipse, but that's far from the only use of p2. One such extension is the ability for engineers to add new repository formats.

p2 typically stores its metadata in XML files (content.jar or content.xml). But this is just one format! There is actually an extension point which you can use to contribute your own formats. Using this extension point, someone could expose Maven artifacts, or even OBR repositories, as p2 metadata. p2 doesn't know anything about the different formats, and when instructed to load a repository at http://example.com, it will simply cycle through each format which has been contributed, looking for metadata (content.jar, compositeContent.jar, site.xml, mavenArtifacts.mvn, obrRepository.obr, etc.).
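To make that cycling concrete, here is a rough sketch in Python (p2 itself is Java, and this is not its actual code). The names `KNOWN_FORMATS`, `find_metadata` and the `exists` callback are all made up for illustration; the real list of file names depends on which formats have been contributed.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical list of metadata file names, one per contributed format.
# In a real client this comes from whatever formats are installed.
KNOWN_FORMATS = ["content.jar", "content.xml", "compositeContent.jar", "site.xml"]


def find_metadata(
    base_url: str,
    exists: Callable[[str], bool],
    formats: Sequence[str] = KNOWN_FORMATS,
) -> Tuple[Optional[str], List[str]]:
    """Probe each format in turn.

    Returns the first URL that exists (or None) together with the list
    of URLs that were tried and missed along the way.
    """
    base = base_url.rstrip("/")
    misses: List[str] = []
    for name in formats:
        candidate = f"{base}/{name}"
        if exists(candidate):      # stand-in for an HTTP request
            return candidate, misses
        misses.append(candidate)   # on a real server this is a failed request
    return None, misses
```

Every entry in `misses` is a request the server had to answer with "not found", and the client has no way to skip them because it doesn't know which format the server actually uses.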

Now here's the problem. What if you want to control the order in which p2 searches? More importantly, what if you want to control the order and offer a fall-back (use the OBR format if the client can read it, otherwise use the standard content.xml format)? Furthermore, what if you want to dynamically change the repository format? These are server configuration problems; that is, it's up to the server administrator (not the client) to give advice about what repository formats are supported. This is what the p2.index file offers.

The p2.index file is the first file p2 reads, and it gives advice to the client about what repositories exist on the server. It is simply a properties file, listing the types of repositories supported and the order in which they should be tried:

 version = 1
 metadata.repository.factory.order = obrRepository.obr, content.xml,\!

This example instructs the p2 client to use the OBR format if it understands it; otherwise it should fall back to the standard XML file. If it can't read either of these two, it should give up because there is nothing else here.

If the file doesn't exist, p2 will fall back to an exhaustive search of all the known formats. We were careful to make sure that p2 could handle the case in which the file does not exist, since we wanted to maintain backwards compatibility with all existing repositories.
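Putting the last few paragraphs together, the decision a client makes might look roughly like the Python sketch below. This is not p2's implementation: `parse_properties` handles only a tiny subset of the properties format, `search_order` is a made-up name, and I'm assuming that the trailing `\!` entry is what tells the client to stop looking. What happens when that entry is missing is also an assumption here (the remaining known formats are tried afterwards).

```python
from typing import Dict, List, Optional, Sequence

ORDER_KEY = "metadata.repository.factory.order"


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple 'key = value' lines; skips blanks and # or ! comments."""
    props: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def search_order(index_text: Optional[str], known: Sequence[str]) -> List[str]:
    """Return the formats to try, in order.

    index_text is the content of p2.index, or None if the file is missing.
    known is the list of formats this particular client understands.
    """
    if index_text is None:
        return list(known)              # no index: exhaustive search
    raw = parse_properties(index_text).get(ORDER_KEY)
    if raw is None:
        return list(known)              # index says nothing about metadata
    order: List[str] = []
    for entry in raw.split(","):
        name = entry.strip()
        if name in ("!", "\\!"):        # assumed terminator: nothing else here
            return order
        if name in known and name not in order:
            order.append(name)          # skip formats the client can't read
    # Assumption: without a terminator, fall through to the other formats.
    return order + [f for f in known if f not in order]


index = "version = 1\n" + ORDER_KEY + " = obrRepository.obr, content.xml,\\!\n"
print(search_order(index, ["content.jar", "content.xml", "site.xml"]))
print(search_order(None, ["content.jar", "content.xml", "site.xml"]))
```

A client without OBR support ends up trying only content.xml for the example file above, while a client that finds no index at all goes back to trying everything it knows.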

So the question is: what can we do to help the Foundation? Can we remove this file (and break anybody using co-hosted repositories)? Should we advise people to include a very simple p2.index file in each repository (although I'm not convinced that reading a small file is better than a 404, but I'll admit I don't know for sure)? Should we try to improve the caching of the file (although up-to-date checks for a small file are as expensive as fetching the file)? Or should we hard-code a special case for eclipse.org (eclipse.org won't support p2.index, but other domains can)?

If you have suggestions (and the resources to help implement them) on how we can solve the bandwidth problem, while not breaking compatibility and solving the co-hosted repository problem, please help us on Bug 381598.
