<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>SPDLab &#187; replication</title>
	<atom:link href="http://spdlab.net/tag/replication/feed" rel="self" type="application/rss+xml" />
	<link>http://spdlab.net</link>
	<description></description>
	<lastBuildDate>Mon, 31 May 2010 15:16:18 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0-RC1</generator>
		<item>
		<title>Postgresql asynchronous replication &#8211; londiste + autossh</title>
		<link>http://spdlab.net/postgresql-asynchronous-replication-londiste-autossh</link>
		<comments>http://spdlab.net/postgresql-asynchronous-replication-londiste-autossh#comments</comments>
		<pubDate>Wed, 01 Apr 2009 15:10:59 +0000</pubDate>
		<dc:creator>SPDLab</dc:creator>
				<category><![CDATA[Tips & tricks]]></category>
		<category><![CDATA[postgresql]]></category>
		<category><![CDATA[replication]]></category>
		<category><![CDATA[skytools]]></category>

		<guid isPermaLink="false">http://78.46.87.37/~spdlab/?p=13</guid>
		<description><![CDATA[As part of a project, I needed to replicate a postgresql database to a different server, where it would be used for analysis. The main database is constantly being filled with data, while the replicated database needs to be accessible to several services in read-only mode on a different segment of the network. Because of that and [...]]]></description>
			<content:encoded><![CDATA[<p>As part of a project, I needed to replicate a postgresql database to a different server, where it would be used for analysis.</p>
<p>The main database is constantly being filled with data, while the replicated database needs to be accessible to several services in read-only mode on a different segment of the network. Because of that, and because a local backup was needed, connecting those services directly to the main database wasn&#8217;t an option, so a plan for postgresql replication was needed. Also, as the two servers aren&#8217;t connected by a secure network or any kind of VPN, the communication channel for transferring the replication data had to be set up in a secure and minimally intrusive way. The state of the data on the slave doesn&#8217;t affect the main server, and temporary out-of-sync periods (after a network failure, for example) are acceptable for this project, so the replication can be asynchronous.<span id="more-13"></span></p>
<p>Unfortunately, postgresql (still) doesn&#8217;t have a native replication mechanism, so help from one of the postgresql replication projects was needed. Although <a href="http://pgfoundry.org/" target="_blank">pgFoundry</a> lists quite a few replication projects, only a few were a real option (working with current versions of postgresql, no need to patch production databases, in a relatively stable state, simple asynchronous master-slave replication). In the end <a href="https://developer.skype.com/SkypeGarage/DbProjects/SkyTools" target="_blank">SkyTools</a>, an open-source project by the Skype developers (who seem to use it for replicating their own postgresql cluster), was chosen for the job.</p>
<p>As for securing the communication between the servers, good ol&#8217; ssh is used: it needs only one port and a one-way connection (from the slave replication server to the master), which keeps things as secure as possible on the replicated server.</p>
<p>With the tools chosen, here is how it&#8217;s all put together. We&#8217;ll use some abbreviations to keep things simple:</p>
<ul>
<li>MasterIP  &#8211; IP of the master postgresql server</li>
<li>MPort &#8211; port of the postgresql server on master server</li>
<li>MDbase &#8211; master database name</li>
<li>SPort &#8211; port of the postgresql server on slave</li>
<li>MSPort &#8211; local tunneled port for postgresql master server on the  slave</li>
<li>SDbase &#8211; slave database name</li>
</ul>
<h2>Connection part</h2>
<p>For our purposes, simply tunneling the master postgresql port to the slave with <span style="color: #808080;">ssh -L MSPort:127.0.0.1:MPort  MasterIP</span> is enough. However, once that connection breaks, it&#8217;s bye-bye replication, so we need to make sure the connection stays up and keeps the tunnel open for the data.</p>
<p>So, meet <a href="http://www.harding.motd.ca/autossh/" target="_blank">autossh</a>, an ssh connection monitoring program. After installing it (either through apt/yum or by compiling it manually) you just need to start it. It doesn&#8217;t have many features, but it does what it&#8217;s intended for remarkably well. It doesn&#8217;t just monitor the ssh process; it also checks for traffic on the connection and checks ssh connectivity (so it doesn&#8217;t repeatedly try to log in to the server in case of wrong credentials or network failure). Traffic checking uses additional monitoring ports: the -M parameter sets the base port, and autossh uses that port and the one immediately above it.</p>
<p>So to start the tunnel and keep it open (use of ssh authentication  keys is recommended here):</p>
<p><span style="color: #808080;"><br />
autossh -M 20000 -f -N -L MSPort:127.0.0.1:MPort MasterIP<br />
</span></p>
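<p>Before moving on, it can help to confirm that something is actually listening on the tunneled port. Below is a minimal Python sketch of such a check; the function name <span style="color: #808080;">tunnel_is_up</span> and the port value 6543 (standing in for MSPort) are made up for illustration and are not part of autossh or SkyTools.</p>

```python
import socket


def tunnel_is_up(port, host="127.0.0.1", timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError on refusal or timeout
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    ms_port = 6543  # hypothetical value standing in for MSPort
    print("tunnel up" if tunnel_is_up(ms_port) else "tunnel down")
```

<p>Note that this only shows that the local end of the tunnel accepts connections; ssh opens the local listener even when postgresql on the master is unreachable, so a real connection with a postgresql client is still the definitive test.</p>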
<h2>Preparation of the postgresql server(s)</h2>
<p>As already stated, very few modifications need to be made on the database servers themselves; mainly it&#8217;s a matter of adding a few python functions and tables needed by skytools&#8230;</p>
<p>Install SkyTools from the <a href="http://pgfoundry.org/projects/skytools/" target="_blank">pgFoundry  page</a> (a simple <span style="color: #808080;">./configure &amp;&amp;  make &amp;&amp; make install</span> usually does the trick, although some dependencies might be needed, such as the postgresql-devel and python-psycopg2 packages). An additional note for RedHat users: skytools modules get installed in &#8220;/usr/local/lib/python..&#8221; by default, while the system expects them at &#8220;/usr/lib/python&#8230;&#8221; &#8211; a symlink or an addition to the path will solve this without much trouble.</p>
<p>The slave server should also already contain the same schema as the master for the database that is going to be replicated.</p>
<h2>SkyTools configuration</h2>
<p>SkyTools needs to be installed on both master and slave (theoretically it could work from the slave only, but that would probably be too slow over this type of connection between the servers), and the config files have to be created by hand. To keep things neat and clean, we create an /etc/skytools directory on both servers and put the config files there:</p>
<h3>On master server</h3>
<p>/etc/skytools/ticker.ini</p>
<p>[pgqadm]<br />
job_name = pgqadm_ticker<br />
db = dbname=MDbase port=MPort host=127.0.0.1<br />
# how often to run maintenance [seconds]<br />
maint_delay = 600<br />
# how often to check for activity [seconds]<br />
loop_delay = 0.1<br />
logfile = /tmp/%(job_name)s.log<br />
pidfile = /tmp/%(job_name)s.pid</p>
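<p>The <span style="color: #808080;">%(job_name)s</span> placeholders in the logfile and pidfile lines follow Python&#8217;s config-file interpolation syntax. As a rough illustration of what they expand to, here is a small sketch using the standard <span style="color: #808080;">configparser</span> module from Python 3. It assumes SkyTools expands the placeholders the same way; the helper <span style="color: #808080;">load_section</span> is made up for this example.</p>

```python
import configparser

# Contents of /etc/skytools/ticker.ini as given above (comments left out)
TICKER_INI = """
[pgqadm]
job_name = pgqadm_ticker
db = dbname=MDbase port=MPort host=127.0.0.1
maint_delay = 600
loop_delay = 0.1
logfile = /tmp/%(job_name)s.log
pidfile = /tmp/%(job_name)s.pid
"""


def load_section(text, section):
    """Parse ini text and return one section with %(name)s references expanded."""
    parser = configparser.ConfigParser()  # BasicInterpolation is the default
    parser.read_string(text)
    return dict(parser.items(section))


cfg = load_section(TICKER_INI, "pgqadm")
print(cfg["logfile"])  # /tmp/pgqadm_ticker.log
print(cfg["pidfile"])  # /tmp/pgqadm_ticker.pid
```

<p>So with this config the ticker log to look at is /tmp/pgqadm_ticker.log.</p>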
<h3>On slave server</h3>
<p>/etc/skytools/conf.ini</p>
<p>[londiste]<br />
job_name = SPDLABreplicate_dbase<br />
provider_db = dbname=MDbase port=MSPort host=127.0.0.1<br />
subscriber_db = dbname=SDbase port=SPort host=127.0.0.1<br />
# it will be used as sql ident so no dots/spaces<br />
pgq_queue_name = SPDLABreplicator<br />
logfile = /tmp/%(job_name)s.log<br />
pidfile = /tmp/%(job_name)s.pid</p>
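<p>Two things are easy to get wrong in this file: the connect strings (the provider has to point at the local tunneled port, not at the master directly) and the queue name, which must not contain dots or spaces. The Python sketch below checks both. The helper names are made up for illustration, and the identifier rule used here (letters, digits and underscores only) is a deliberately conservative assumption rather than the exact rule SkyTools applies.</p>

```python
import re


def parse_conninfo(conninfo):
    """Split a simple 'key=value key=value' connect string into a dict.

    Quoted values containing spaces are not handled in this sketch.
    """
    return dict(pair.split("=", 1) for pair in conninfo.split())


def is_safe_queue_name(name):
    """True if name has only letters, digits and underscores and does not start with a digit."""
    return re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", name) is not None


provider = parse_conninfo("dbname=MDbase port=MSPort host=127.0.0.1")
print(provider["host"], provider["port"])       # 127.0.0.1 MSPort
print(is_safe_queue_name("SPDLABreplicator"))   # True
print(is_safe_queue_name("spdlab.replicator"))  # False
```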
<h2>Database preparation and replication start</h2>
<p>After this is done, we can make the necessary modifications to the databases and start the replication (the commands should be executed by the postgresql superuser; each command states which server it runs on):</p>
<p><strong>ON MASTER</strong> &#8211; <span style="color: #808080;">pgqadm.py  /etc/skytools/ticker.ini install</span> (installing the ticker sql on master  database)</p>
<p><strong>ON SLAVE</strong> &#8211; <span style="color: #808080;">londiste.py  /etc/skytools/conf.ini provider install</span> (installing the londiste provider  sql)</p>
<p><strong>ON SLAVE</strong> &#8211; <span style="color: #808080;">londiste.py  /etc/skytools/conf.ini subscriber install</span> (installing the londiste  subscriber sql)</p>
<p><strong>ON MASTER</strong> &#8211; <span style="color: #808080;">pgqadm.py  /etc/skytools/ticker.ini register SPDLABreplicator SPDLABreplicate_dbase</span> (registering  the consumer for ticker)</p>
<p><strong>ON MASTER</strong> &#8211; <span style="color: #808080;">pgqadm.py -d  /etc/skytools/ticker.ini ticker</span> (starting the ticker daemon; can be  stopped with -s switch)</p>
<p><strong>ON SLAVE</strong> &#8211; <span style="color: #808080;">londiste.py  /etc/skytools/conf.ini provider add --all</span> (adding all tables on provider  for replication)</p>
<p><strong>ON SLAVE</strong> &#8211; <span style="color: #808080;">londiste.py  /etc/skytools/conf.ini subscriber add --all</span> (adding all  tables on subscriber for replication)</p>
<p><strong>ON SLAVE</strong> &#8211; <span style="color: #808080;">londiste.py -d  /etc/skytools/conf.ini replay</span> (starting the londiste daemon, can be  stopped with -s switch)</p>
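<p>Since the order of these steps matters and they alternate between the two servers, it can be handy to keep them as a checklist in a small script. The Python sketch below only encodes the commands listed above; the names <span style="color: #808080;">STEPS</span>, <span style="color: #808080;">format_plan</span> and <span style="color: #808080;">run_step</span> are made up for illustration. It defaults to a dry run and refuses to run a step on the wrong server. Note that the <span style="color: #808080;">--all</span> option is written with two plain ASCII hyphens.</p>

```python
import subprocess

TICKER = "/etc/skytools/ticker.ini"
CONF = "/etc/skytools/conf.ini"

# (server, argv) pairs in the order given in the text above
STEPS = [
    ("master", ["pgqadm.py", TICKER, "install"]),
    ("slave", ["londiste.py", CONF, "provider", "install"]),
    ("slave", ["londiste.py", CONF, "subscriber", "install"]),
    ("master", ["pgqadm.py", TICKER, "register",
                "SPDLABreplicator", "SPDLABreplicate_dbase"]),
    ("master", ["pgqadm.py", "-d", TICKER, "ticker"]),
    ("slave", ["londiste.py", CONF, "provider", "add", "--all"]),
    ("slave", ["londiste.py", CONF, "subscriber", "add", "--all"]),
    ("slave", ["londiste.py", "-d", CONF, "replay"]),
]


def format_plan(steps=STEPS):
    """Return the steps as numbered checklist lines."""
    return ["%d. [%s] %s" % (number, server.upper(), " ".join(argv))
            for number, (server, argv) in enumerate(steps, 1)]


def run_step(number, current_server, dry_run=True):
    """Run one step (numbered from 1), refusing if it belongs on the other server."""
    server, argv = STEPS[number - 1]
    if server != current_server:
        raise ValueError("step %d belongs on the %s" % (number, server))
    if not dry_run:
        subprocess.run(argv, check=True)  # raises CalledProcessError on failure
    return argv


for line in format_plan():
    print(line)
```

<p>Calling <span style="color: #808080;">run_step(2, "slave", dry_run=False)</span> on the slave would then execute the second step, as the postgresql superuser as noted above.</p>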
<p>Well, after this the replication should have started, and the slave should get in sync and be kept in that state. If that&#8217;s not the case, check the log files defined in the config files and use these commands for debugging:</p>
<p><strong>ON MASTER</strong> &#8211; <span style="color: #808080;">pgqadm.py  /etc/skytools/ticker.ini status</span> (checking the status of the ticker and  updates)</p>
<p><strong>ON SLAVE</strong> &#8211; <span style="color: #808080;">londiste.py  /etc/skytools/conf.ini subscriber tables</span> (checking the state of the tables)</p>
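<p>For the log files, a quick filter for problem lines saves some scrolling. The Python sketch below assumes that the logs contain level names such as WARNING or ERROR as separate words, as Python&#8217;s standard logging does; that is an assumption about the log format, so adjust the names if your logs look different. The path is the logfile setting from conf.ini with the job name filled in, and <span style="color: #808080;">problem_lines</span> is a made-up helper name.</p>

```python
import os

LOGFILE = "/tmp/SPDLABreplicate_dbase.log"  # logfile from conf.ini with job_name filled in
LEVELS = ("WARNING", "ERROR", "CRITICAL")   # assumed level names, as in Python logging


def problem_lines(path, levels=LEVELS):
    """Return the lines of a log file that contain one of the level names as a word."""
    hits = []
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if any(level in line.split() for level in levels):
                hits.append(line.rstrip("\n"))
    return hits


if os.path.exists(LOGFILE):
    for line in problem_lines(LOGFILE):
        print(line)
```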
<h2>Additional notes</h2>
<p>SkyTools doesn&#8217;t seem to be well documented when it comes to installing and maintaining replication. The steps explained above work fine for me, but you might run into problems somewhere along the way, so it&#8217;s good to keep an eye on the <a href="http://skytools.projects.postgresql.org/doc/" target="_blank">original  documentation</a> when some additional help is needed&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://spdlab.net/postgresql-asynchronous-replication-londiste-autossh/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
