Backup-restore procedures

Author: Thomas Bellembois ([University of Rennes 1|http://www.univ-rennes1.fr])

Introduction


The ESUP-WebDAV server is based on the Jakarta Slide WebDAV server. In this document, the terms "Slide" and "ESUP WebDAV server" refer to the same entity: the ESUP WebDAV server.

Slide does not provide a user-friendly backup/restore interface. However, because Slide content and metadata are well structured, they can be backed up and restored easily.

The ESUP Portail consortium does not provide a user-friendly tool for backup/restore procedures either, but we have tested two procedures that work properly.

Feel free to contact me for further details.

Notational conventions:


To understand how to back up, and especially how to restore, Slide content and metadata, you need to know how Slide handles this data.

In the build.properties file of the ESUP WebDAV server, you defined five parameters:

slide.rootPath = /home/tbellemb/esup-serveur-WebDav-3.5/SlideData
slide.contentRootStore = ${slide.rootPath}/content/store
slide.contentWorkStore = ${slide.rootPath}/content/work
slide.metadataRootStore = ${slide.rootPath}/metadata/store
slide.metadataWorkStore = ${slide.rootPath}/metadata/work
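
To make the ${slide.rootPath} references concrete, here is a minimal Python sketch that reads such properties and expands the references. The function names load_properties and resolve are hypothetical helpers written for this illustration, and the expansion rule is an assumption about how the build resolves these values, not a description of the build tool itself.

```python
import re

def load_properties(text):
    """Parse 'key = value' lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props

def resolve(props):
    """Expand ${name} references using the other properties."""
    pattern = re.compile(r"\$\{([^}]+)\}")
    resolved = {}
    for key, value in props.items():
        # Repeat until nothing changes; bounded so a cycle cannot loop forever.
        for _ in range(10):
            expanded = pattern.sub(
                lambda m: props.get(m.group(1), m.group(0)), value
            )
            if expanded == value:
                break
            value = expanded
        resolved[key] = value
    return resolved

# Values copied from the build.properties excerpt above.
SAMPLE = """\
slide.rootPath = /home/tbellemb/esup-serveur-WebDav-3.5/SlideData
slide.contentRootStore = ${slide.rootPath}/content/store
slide.metadataRootStore = ${slide.rootPath}/metadata/store
"""

paths = resolve(load_properties(SAMPLE))
```

With the sample above, paths["slide.contentRootStore"] and paths["slide.metadataRootStore"] hold the two store directories that matter for backup and restore.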

Note that once the server is deployed, these values can also be found in the webapps/Slide/Domain.xml file in the deployment directory:

...
<nodestore classname="org.apache.slide.store.txfile.TxXMLFileDescriptorsStore">
<parameter name="rootpath">/home/tbellemb/esup-serveur-WebDav-3.5/SlideData/metadata/store</parameter>
<parameter name="workpath">/home/tbellemb/esup-serveur-WebDav-3.5/SlideData/metadata/work</parameter>
<parameter name="defer-saving">true</parameter>
<parameter name="timeout">120</parameter>
</nodestore>
...
<contentstore classname="org.apache.slide.store.txfile.TxFileContentStore">
<parameter name="rootpath">/home/tbellemb/esup-serveur-WebDav-3.5/SlideData/content/store</parameter>
<parameter name="workpath">/home/tbellemb/esup-serveur-WebDav-3.5/SlideData/content/work</parameter>
<parameter name="defer-saving">true</parameter>
<parameter name="timeout">120</parameter>
</contentstore>
...
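
If you prefer to read the store locations from a deployed server, the following Python sketch extracts the rootpath parameters from the nodestore and contentstore elements. It is only an illustration: the store wrapper element in the sample is a hypothetical stand-in for the parts of Domain.xml elided above, and store_paths is a helper name invented here.

```python
import xml.etree.ElementTree as ET

# The <store> wrapper is hypothetical; only the inner elements come from
# the Domain.xml excerpt above.
SAMPLE = """\
<store>
  <nodestore classname="org.apache.slide.store.txfile.TxXMLFileDescriptorsStore">
    <parameter name="rootpath">/home/tbellemb/esup-serveur-WebDav-3.5/SlideData/metadata/store</parameter>
    <parameter name="workpath">/home/tbellemb/esup-serveur-WebDav-3.5/SlideData/metadata/work</parameter>
  </nodestore>
  <contentstore classname="org.apache.slide.store.txfile.TxFileContentStore">
    <parameter name="rootpath">/home/tbellemb/esup-serveur-WebDav-3.5/SlideData/content/store</parameter>
    <parameter name="workpath">/home/tbellemb/esup-serveur-WebDav-3.5/SlideData/content/work</parameter>
  </contentstore>
</store>
"""

def store_paths(xml_text):
    """Return {'nodestore': path, 'contentstore': path} from the rootpath parameters."""
    root = ET.fromstring(xml_text)
    paths = {}
    for tag in ("nodestore", "contentstore"):
        for store in root.iter(tag):
            for param in store.iter("parameter"):
                if param.get("name") == "rootpath":
                    paths[tag] = (param.text or "").strip()
    return paths

paths = store_paths(SAMPLE)
```

For a real deployment, you would pass the text of webapps/Slide/Domain.xml instead of SAMPLE. The nodestore path is the metadata branch and the contentstore path is the content branch.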

The work directories are only used as temporary directories, so we will leave them aside.

That leaves two branches: the content store (content/store) and the metadata store (metadata/store).

Slide uses the same hierarchy to store content and metadata as the one created by users.

Let's have a look at this hierarchy:

!dataStructure.png!

In the metadata branch, each directory contains the descriptor files (*.def.xml) of its files and sub-directories. These XML files hold information such as the creation date, the display name, and the permissions on the resource. File and directory descriptors differ slightly, and each carries information specific to its type. A directory descriptor lists its files and sub-directories, its parent, and its permissions.

Descriptor of the /files directory:

<children>
 <child name="quotas" uuri="/files/quotas"/>
 <child name="homedirs" uuri="/files/homedirs"/>
 <child name="partages" uuri="/files/partages"/>
 <child name="test" uuri="/files/test"/>
</children>
...
<parents>
 <parent name="files" uuri="/"/>
</parents>
...
<permissions>
 <permission subjectUri="/roles/uPortal/ToutLeMonde/Personnels/CRI/SI" actionUri="all" inheritable="true" negative="false" />
</permissions>
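
To inspect a directory descriptor without opening it by hand, a short Python sketch such as the one below can list its children and permissions. The descriptor root element in the sample is a hypothetical wrapper, since the excerpt above elides the real root, and summarize is a helper name chosen for this example.

```python
import xml.etree.ElementTree as ET

# <descriptor> is a hypothetical wrapper; the real root element is elided
# in the excerpt above. The inner elements are copied from that excerpt.
SAMPLE = """\
<descriptor>
  <children>
    <child name="quotas" uuri="/files/quotas"/>
    <child name="homedirs" uuri="/files/homedirs"/>
    <child name="partages" uuri="/files/partages"/>
    <child name="test" uuri="/files/test"/>
  </children>
  <permissions>
    <permission subjectUri="/roles/uPortal/ToutLeMonde/Personnels/CRI/SI" actionUri="all" inheritable="true" negative="false"/>
  </permissions>
</descriptor>
"""

def summarize(xml_text):
    """Return (children, permissions) found anywhere in the descriptor."""
    root = ET.fromstring(xml_text)
    children = [(c.get("name"), c.get("uuri")) for c in root.iter("child")]
    permissions = [
        (p.get("subjectUri"), p.get("actionUri"), p.get("negative") == "true")
        for p in root.iter("permission")
    ]
    return children, permissions

children, permissions = summarize(SAMPLE)
```

Here children is a list of (name, uuri) pairs and permissions is a list of (subject, action, is_negative) tuples. The sketch only reads the descriptor and never writes to it.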

The content branch contains the files themselves (that is, their binary content). Each filename ends with the file revision number, which is always _1.0 because the ESUP WebDAV server does not use versioning.
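
The naming rules can be summed up in a few lines of Python. This is a sketch only: the helper names are invented for this illustration, and the fixed _1.0 suffix relies on the statement above that versioning is not used.

```python
REVISION_SUFFIX = "_1.0"      # always 1.0 because versioning is not used
DESCRIPTOR_SUFFIX = ".def.xml"

def content_file_name(resource_name):
    """Name of the file holding the binary content in the content branch."""
    return resource_name + REVISION_SUFFIX

def descriptor_file_name(resource_name):
    """Name of the descriptor file in the metadata branch."""
    return resource_name + DESCRIPTOR_SUFFIX

def resource_name(content_file):
    """Recover the user-visible name from a content branch filename."""
    if not content_file.endswith(REVISION_SUFFIX):
        raise ValueError("not a content file name: " + content_file)
    return content_file[: -len(REVISION_SUFFIX)]
```

For example, a resource named foo1.doc is stored as foo1.doc_1.0 in the content branch and described by foo1.doc.def.xml in the metadata branch.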

The content and metadata branches are tightly coupled. Adding a file to the content branch has no effect (the file is not visible on the server) unless you also update the descriptor of the directory that should contain it.

1. Descriptor (.def.xml) files MUST NOT be modified while the server is running. Doing so may crash the server.

2. An incorrectly modified descriptor (XML that is not well formed, for example) may also crash the server.
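
Because a malformed descriptor may crash the server, it can help to check every descriptor for well-formedness before restarting. The Python sketch below does this under the assumption that descriptors are ordinary XML files ending in .def.xml. The name malformed_descriptors is hypothetical, and the check covers well-formedness only, not whether Slide accepts the content.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

def malformed_descriptors(metadata_root):
    """Return (path, error message) for each *.def.xml that fails to parse."""
    problems = []
    for path in sorted(Path(metadata_root).rglob("*.def.xml")):
        try:
            ET.parse(str(path))
        except ET.ParseError as exc:
            problems.append((path, str(exc)))
    return problems
```

Run it against the metadata store directory while the server is stopped. An empty list means every descriptor parsed as well-formed XML.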


Backup/restore process


There are two main ways to restore Slide data, depending on the permissions set on the folder tree you want to restore. Either way, you must make sure that the Slide backend (myslideData) resides on a filesystem that is backed up.

Folder tree with no or few permissions (like a homedir)


Consider the following tree (resources marked with an * have access restrictions, that is, permissions):

homedirs* -- tbellemb* -- c
                       -- b
                       -- a -- a1 -- foo3.doc
                            -- foo1.doc
                            -- foo2.doc

The user tbellemb deletes his "a" directory and would like it to be restored.

The last saved backend is (content branch):

content -- store -- homedirs -- tbellemb -- c
                                         -- b
                                         -- a -- a1 -- foo3.doc_1.0
                                              -- foo1.doc_1.0
                                              -- foo2.doc_1.0

To restore the "a" directory:

Folder tree with many permissions (like a shared space)


Consider the following tree (resources marked with an * have access restrictions, that is, permissions):

partages* -- SI* -- c*
                 -- b*
                 -- a* -- a1** -- a11* -- ..
                               -- a12* -- ..
                       -- foo1.doc
                       -- foo2.doc

A user deletes the "a" directory and would like it to be restored.

Using the first restore method would be tedious and would force the administrator to reapply the correct access rights to each sub-directory of "a" after the restoration.

The last saved backend is (content and metadata branches):

content -- store -- partages -- SI -- c
                                   -- b
                                   -- a -- a1 -- a11 -- ..
                                              -- a12 -- ..
                                        -- foo1.doc_1.0
                                        -- foo2.doc_1.0

metadata -- store -- partages
                  -- [.def.xml]
                  -- [partages.def.xml]
                                        -- SI
                                        -- [SI.def.xml]
                                                        -- b
                                                        -- c
                                                        -- [a.def.xml]
                                                        -- [b.def.xml]
                                                        -- [c.def.xml]
                                                        -- a           -- foo1.doc.def.xml
                                                                       -- foo2.doc.def.xml



To restore the "a" directory:

* Stop the WebDAV server

SI.def.xml:

<children>
      <child name="b" uuri="/partages/SI/b" />
      <child name="c" uuri="/partages/SI/c" />
      <!-- RESTORED DIRECTORY-->
      <child name="a" uuri="/partages/SI/a" />
</children>
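
The edit to SI.def.xml shown above can also be scripted. The following Python sketch re-adds a child entry to the children element and leaves the descriptor untouched if the entry already exists. The name add_child is hypothetical, and the sketch assumes the server is stopped and that rewriting the descriptor through an XML library is acceptable (comments and the XML declaration are not preserved).

```python
import xml.etree.ElementTree as ET

def add_child(xml_text, name, uuri):
    """Return the descriptor text with <child name=... uuri=...> added."""
    root = ET.fromstring(xml_text)
    # <children> may be the root of the fragment or nested deeper.
    children = root if root.tag == "children" else root.find(".//children")
    if children is None:
        raise ValueError("descriptor has no <children> element")
    if any(c.get("name") == name for c in children.findall("child")):
        return xml_text  # already listed, nothing to do
    ET.SubElement(children, "child", {"name": name, "uuri": uuri})
    return ET.tostring(root, encoding="unicode")

# Fragment taken from SI.def.xml as it looks after "a" was deleted.
BEFORE = (
    '<children>'
    '<child name="b" uuri="/partages/SI/b" />'
    '<child name="c" uuri="/partages/SI/c" />'
    '</children>'
)

AFTER = add_child(BEFORE, "a", "/partages/SI/a")
```

Keep a copy of the original descriptor before writing the result back, and check that the output is well formed before restarting the server.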