Date: 2012-07-27 09:00:23
From: marek_surek@yahoo.co.uk
Hi,
I would like to ask how exactly a backup is performed on a running system. I read the backup part of the FAQ and everything seems fine, but I'm confused by the word 'seamlessly'. So, taking a programmatic approach:
1. Is a backup consistency-safe on a running system? Does it differ between OWLIM-SE and OWLIM-EE? My best guess is that in OWLIM-EE the backup is performed on one worker node in the following scenario:
a. One worker node is chosen and stops being kept up to date/replicated with the other worker nodes
b. A full backup is made on this worker node
c. The worker node is added back to work/replication and is brought up to date with the other nodes
2. Doesn't performing a backup on a running system degrade performance to such a level that it is unusable by a high number of users?
3. Is there any way to do an incremental backup? The store in use has tens of GB, so the backup file size and the time needed for a backup will be enormous if a consistent backup cannot be made on a running system.
Thank you for your support.
Best regards,
Marek

asked 03 Apr '13, 10:41



Date: 2012-08-06 18:36:47
From: Luciano.Blasetti@fao.org
Ok, solved with 5.2
Thx

Date: 2012-07-30 09:17:32
From: barry.bishop@ontotext.com
Hi Marek,
There are a number of ways to make a backup, each with their pros and 
cons. I'll try to list them all here:
1. OWLIM-Enterprise
Using OWLIM-Enterprise, one method is indeed to take a worker node out 
of the cluster, shut it down, back up the storage files, restart it and 
add it back to the cluster. Depending on the number of updates that have 
occurred while the worker node was absent, either just the missing 
updates are replayed by the active master to the worker node or a full 
replication takes place.
pros: a complete image is taken, a worker can be recreated without any 
loading/inference required
cons: cluster query performance can drop while a worker is offline, 
cluster could become read-only if a deep replication is required 
after adding the worker node back to the cluster
2. OWLIM-Enterprise and OWLIM-SE
Execute a query to retrieve all explicit statements using the special 
explicit graph name:
SELECT *
FROM <http://www.ontotext.com/explicit>
{
{ ?s ?p ?o }
UNION
{ GRAPH ?g { ?s ?p ?o } }
}
pros: easy to do
cons: some programming is required to store the results in a suitable 
format, some overlap will occur with statements appearing in named graphs 
and also in the default graph, will not work with very large databases over 
the Sesame HTTP protocol as the results are fetched in one go
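The "some programming" for option 2 could look roughly like the sketch below. It assumes the query results were fetched in the SPARQL JSON results format and turns the bindings into N-Quads lines. The function names `term` and `to_nquads` are made up for illustration. It also assumes that a row without ?g that matches a row with ?g is the overlap mentioned above and can be dropped, which would lose a statement that genuinely exists in both the default graph and a named graph. Everything is held in memory, so this only suits small repositories.

```python
def term(binding):
    """Serialise one SPARQL JSON binding as an N-Quads term."""
    if binding["type"] == "uri":
        return f"<{binding['value']}>"
    if binding["type"] == "bnode":
        return f"_:{binding['value']}"
    # Anything else is treated as a literal; escape the characters
    # that N-Quads does not allow unescaped inside a quoted string.
    escaped = (binding["value"].replace("\\", "\\\\").replace('"', '\\"')
               .replace("\n", "\\n").replace("\r", "\\r"))
    if "xml:lang" in binding:
        return f'"{escaped}"@{binding["xml:lang"]}'
    if "datatype" in binding:
        return f'"{escaped}"^^<{binding["datatype"]}>'
    return f'"{escaped}"'


def to_nquads(bindings):
    """Convert result rows (?s ?p ?o and optionally ?g) to N-Quads lines."""
    quads = [(term(r["s"]), term(r["p"]), term(r["o"]),
              term(r["g"]) if "g" in r else None) for r in bindings]
    in_named_graph = {q[:3] for q in quads if q[3] is not None}
    lines, seen = [], set()
    for s, p, o, g in quads:
        if g is None and (s, p, o) in in_named_graph:
            continue  # ASSUMPTION: this row is the overlap, not a separate statement
        if (s, p, o, g) in seen:
            continue  # exact duplicate row
        seen.add((s, p, o, g))
        lines.append(f"{s} {p} {o} {g} ." if g else f"{s} {p} {o} .")
    return lines
```

The input would be the `results.bindings` array of the JSON response; writing the returned lines to a file, one per line, gives a loadable N-Quads dump.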
3. OWLIM-Enterprise and OWLIM-SE
Use the graph store protocol, e.g. retrieve all explicit statements and 
store them in TriG format:
curl -X GET -H "Accept:application/x-trig" \
"http://localhost:8080/openrdf-sesame/repositories/repo_id/rdf-graphs/service?graph=http://www.ontotext.com/explicit"
pros: can be executed on the command line against a master or worker 
node, will work with very large databases by streaming results
cons: restoring a backup requires loading it like a normal file - incurs 
the reasoning overhead
You can also use the N-Quads format which is supported in OWLIM 5.2:
curl -X GET -H "Accept:text/x-nquads" \
"http://localhost:8080/openrdf-sesame/repositories/repo_id/rdf-graphs/service?graph=http://www.ontotext.com/explicit"
Later in the year we will implement a new OWLIM plug-in for making 
online back-ups, but this is not available yet. My choice would be to 
use the graph store protocol.
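As a sketch of the graph store approach without curl, the following Python script streams the export to a file in fixed-size chunks, so client memory use stays small regardless of repository size. It uses the same server URL and repository ID placeholders as the curl commands. The function names `export_url` and `backup` are made up for illustration, and percent-encoding the graph IRI assumes the server decodes query parameters in the usual way.

```python
import shutil
import urllib.parse
import urllib.request

EXPLICIT_GRAPH = "http://www.ontotext.com/explicit"


def export_url(server, repo_id, graph=EXPLICIT_GRAPH):
    """Build the graph store URL used in the curl commands."""
    query = urllib.parse.urlencode({"graph": graph})
    base = server.rstrip("/")
    return f"{base}/repositories/{repo_id}/rdf-graphs/service?{query}"


def backup(server, repo_id, out_path, mime="application/x-trig"):
    """Stream the export to out_path without loading it all into memory."""
    request = urllib.request.Request(
        export_url(server, repo_id), headers={"Accept": mime}
    )
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as out:
        # Copy in 1 MiB chunks as the server sends them.
        shutil.copyfileobj(response, out, length=1024 * 1024)
```

Calling `backup("http://localhost:8080/openrdf-sesame", "repo_id", "backup.trig")` would write the TriG export; passing `mime="text/x-nquads"` asks for N-Quads instead.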
I hope this helps,
barry
Barry Bishop
OWLIM Product Manager
Ontotext AD
Tel: +43 650 2000 237
email: barry.bishop at ontotext.com
skype: bazbishop
www.ontotext.com

Date: 2012-07-30 10:17:49
From: marek_surek@yahoo.co.uk
Thank you Barry for your answer,
I tried the third option, but one thing came to mind: is it consistency-safe? On a big dataset the procedure takes a large amount of time, so I would like to ask:
Let's say the backup procedure takes 300 seconds.
1. If a statement is added right after I start the backup procedure, is it guaranteed that it won't be included in the backup file? When I used the getStatements() method in Sesame, it seemed to me that this is guaranteed because the result is loaded into memory (which is useless for large databases, as we couldn't reserve tens of GB of memory just for the backup procedure).
2. Or is the repository read-only during the backup?
I tried running a SELECT query during the backup and it worked fine, so I think the only effect during the backup procedure is a performance drop. Am I right?
Thank you for answers,
Marek

Date: 2012-07-30 11:41:39
From: barry.bishop@ontotext.com
Hi Marek,
OWLIM implements transaction isolation, so once a query starts, its 
results will not be affected by any update that is committed while it 
is running.
(This is achieved by associating a transaction with its own copy of the 
page index(es) and implementing copy-on-write semantics for database pages.)
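The idea can be illustrated with a toy model. This is not OWLIM's actual implementation, and the names `PagedStore`, `begin_read` and `read_all` are invented for the example. A reader takes its own copy of the page index, and a commit never changes a page in place, so a long-running backup query keeps seeing the state from the moment it started.

```python
class PagedStore:
    """Toy store: a page index maps page numbers to immutable pages."""

    def __init__(self):
        self._index = {}  # page number -> tuple of statements

    def begin_read(self):
        # The reader gets its own copy of the page index, not of the pages.
        return dict(self._index)

    def commit(self, page_no, statements):
        # Copy-on-write: build a new page and repoint the index entry.
        # Pages still referenced by older snapshots are left untouched.
        old_page = self._index.get(page_no, ())
        self._index[page_no] = old_page + tuple(statements)


def read_all(snapshot):
    """Return every statement visible through a snapshot of the index."""
    return [s for page_no in sorted(snapshot) for s in snapshot[page_no]]


store = PagedStore()
store.commit(0, ["s1 p1 o1"])
snapshot = store.begin_read()        # the backup query starts here
store.commit(0, ["s2 p2 o2"])        # an update commits during the backup
print(read_all(snapshot))            # ['s1 p1 o1']
print(read_all(store.begin_read()))  # ['s1 p1 o1', 's2 p2 o2']
```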
getStatements() will stream results when used on a local instance of 
OWLIM, but not remotely using the HTTP interface (RemoteRepositoryManager).
However, I am told that the new SparqlRepository class will stream 
results properly and would be suitable for running your backup task. 
However, this class does not fully support all Repository operations 
(yet), so is not a direct replacement for HTTPRepository (yet).
I hope this helps,
barry

Date: 2012-08-06 14:34:21
From: Luciano.Blasetti@fao.org
Hi Barry,
I tried with
curl -X GET -H "Accept:application/x-trig" "http://localhost:8080/openrdf-sesame/repositories/repo_id/rdf-graphs/service?graph=http://www.ontotext.com/explicit" > backup.trig
but it seems that the produced trig file doesn't contain the original context names (all the triples are exported under the same unnamed context).
Any hints?
Thx,
Luciano

Date: 2012-08-06 14:40:10
From: Luciano.Blasetti@fao.org
Actually it seems to work fine on 5.2 but not on 5.1.

Date: 2012-08-06 16:34:20
From: barry.bishop@ontotext.com
Hello Luciano,
I'm afraid I don't remember the exact issue, but there was a recent fix 
to OWLIM (most likely included in 5.2) that would allow this method of 
making a backup to work.
Given that it works in 5.2, are you able to continue on this basis?
Regards,
barry
