<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Recent changes to 487: The pooler process doesn't exit some times after excuting "stop all"</title><link>https://sourceforge.net/p/postgres-xc/bugs/487/</link><description>Recent changes to 487: The pooler process doesn't exit some times after excuting "stop all"</description><atom:link href="https://sourceforge.net/p/postgres-xc/bugs/487/feed.rss" rel="self"/><language>en</language><lastBuildDate>Mon, 04 Aug 2014 08:13:22 -0000</lastBuildDate><atom:link href="https://sourceforge.net/p/postgres-xc/bugs/487/feed.rss" rel="self" type="application/rss+xml"/><item><title>#487 The pooler process doesn't exit some times after excuting "stop all"</title><link>https://sourceforge.net/p/postgres-xc/bugs/487/?limit=25#5361/0cf1</link><description>&lt;div class="markdown_content"&gt;&lt;p&gt;I found a problem about autovacuum process&lt;/p&gt;
&lt;p&gt;The code in question is in routine AutoVacLauncherMain:&lt;/p&gt;
&lt;p&gt;if (sigsetjmp(local_sigjmp_buf, 1) != 0)  -- A&lt;br /&gt;
{ ... }&lt;/p&gt;
&lt;p&gt;rebuild_database_list(InvalidOid);        -- B&lt;/p&gt;
&lt;p&gt;If there is an error in rebuild_database_list, the routine jumps back to A, and a deadlock happens. This occurred when I executed "stop all" in pgxc_ctl: all the other processes had exited except the logger and autovacuum.&lt;br /&gt;
The logger didn't exit because autovacuum was still generating logs.&lt;/p&gt;
&lt;p&gt;Stack of autovacuum at the point where the error is generated:&lt;br /&gt;
0  GetSnapshotDataCoordinator (snapshot=0xcb4240 &amp;lt;CurrentSnapshotData&amp;gt;) at procarray.c:3058&lt;br /&gt;
1  0x0000000000730b65 in GetPGXCSnapshotData (snapshot=0xcb4240 &amp;lt;CurrentSnapshotData&amp;gt;) at procarray.c:2837&lt;br /&gt;
2  0x000000000072f0df in GetSnapshotData (snapshot=0xcb4240 &amp;lt;CurrentSnapshotData&amp;gt;) at procarray.c:1411&lt;br /&gt;
3  0x000000000089f3b7 in GetTransactionSnapshot () at snapmgr.c:180&lt;br /&gt;
4  0x00000000006e85b9 in get_database_list () at autovacuum.c:1860&lt;br /&gt;
5  0x00000000006e7592 in rebuild_database_list (newdb=0) at autovacuum.c:976&lt;br /&gt;
6  0x00000000006e6ea7 in AutoVacLauncherMain (argc=0, argv=0x0) at autovacuum.c:586&lt;br /&gt;
7  0x00000000006e6b5b in StartAutoVacLauncher () at autovacuum.c:391&lt;br /&gt;
8  0x00000000006f5cda in reaper (postgres_signal_arg=17) at postmaster.c:2750&lt;br /&gt;
9  &amp;lt;signal handler called&amp;gt;&lt;br /&gt;
10 0x00007fedecb65b43 in __select_nocancel () from /lib64/libc.so.6&lt;br /&gt;
11 0x00000000006f406d in ServerLoop () at postmaster.c:1662&lt;br /&gt;
12 0x00000000006f3975 in PostmasterMain (argc=5, argv=0x15aced0) at postmaster.c:1369&lt;br /&gt;
13 0x000000000065a9f9 in main (argc=5, argv=0x15aced0) at main.c:206&lt;/p&gt;&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">peace zone</dc:creator><pubDate>Mon, 04 Aug 2014 08:13:22 -0000</pubDate><guid>https://sourceforge.neta55e8596769172ec0e50d2e52ab27c0a1c9076bc</guid></item><item><title>#487 The pooler process doesn't exit some times after excuting "stop all"</title><link>https://sourceforge.net/p/postgres-xc/bugs/487/?limit=25#5361</link><description>&lt;div class="markdown_content"&gt;&lt;p&gt;AFAIK syslogger is using Latch mechanism which is very well considered not to lose the event at any timing. The Latch mechanism uses pipe and poll.&lt;/p&gt;
&lt;p&gt;We might need to consider the fear that postmaster doesn't sending the signal.&lt;/p&gt;&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">cbx</dc:creator><pubDate>Wed, 23 Jul 2014 01:25:31 -0000</pubDate><guid>https://sourceforge.net0ace87b783ced217245b424fb3c28ae310d2b6b3</guid></item><item><title>#487 The pooler process doesn't exit some times after excuting "stop all"</title><link>https://sourceforge.net/p/postgres-xc/bugs/487/?limit=25#1a50/a142/803e</link><description>&lt;div class="markdown_content"&gt;&lt;p&gt;I found pooler and syslogger were still alive after the "stop all",  the pooler stopped at select and the syslogger stopped at poll. I forgot to dump the stacks of the two processes, but I think the problem may be the same, I will dump the stack next time.&lt;/p&gt;&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">peace zone</dc:creator><pubDate>Tue, 22 Jul 2014 08:20:32 -0000</pubDate><guid>https://sourceforge.net9bd7e84251021bcab04d7f4d30a4befa763dd29e</guid></item><item><title>#487 The pooler process doesn't exit some times after excuting "stop all"</title><link>https://sourceforge.net/p/postgres-xc/bugs/487/?limit=25#1a50/a142/a571</link><description>&lt;div class="markdown_content"&gt;&lt;p&gt;Thank you for your response!&lt;/p&gt;&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">peace zone</dc:creator><pubDate>Tue, 22 Jul 2014 08:14:53 -0000</pubDate><guid>https://sourceforge.netd4bd5d3a8c301cf1d43434cc8451a30d0d0d74d4</guid></item><item><title>#487 The pooler process doesn't exit some times after excuting "stop all"</title><link>https://sourceforge.net/p/postgres-xc/bugs/487/?limit=25#1a50/a142</link><description>&lt;div class="markdown_content"&gt;&lt;p&gt;SIGTERM interrupts select system-call and then select returns -1 with errno = EINTR.&lt;br /&gt;
So poolmgr notices the signal only IF POOLMGR IS WAITING IN THE SYSTEM CALL.&lt;/p&gt;
&lt;p&gt;This means the issue can happen when the signal is caught before select is called and the poolmgr has no connection: the handler has already run, so nothing interrupts the select.&lt;br /&gt;
I recommend adding a timeout or some other stricter logic.&lt;/p&gt;
&lt;p&gt;I think Syslogger doesn't have this kind of problem. Why do you think it has?&lt;/p&gt;&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">cbx</dc:creator><pubDate>Tue, 22 Jul 2014 07:47:54 -0000</pubDate><guid>https://sourceforge.nete7efb453577f85978266e477fac2f461d1bf1992</guid></item><item><title>#487 The pooler process doesn't exit some times after excuting "stop all"</title><link>https://sourceforge.net/p/postgres-xc/bugs/487/?limit=25#1a50</link><description>&lt;div class="markdown_content"&gt;&lt;p&gt;Thank you for your response!&lt;/p&gt;
&lt;p&gt;The pooler process catches the SIGTERM that the "stop all" command in pgxc_ctl sends&lt;br /&gt;
and then only sets shutdown_requested to true. The pooler process exits only when shutdown_requested = true.&lt;/p&gt;
&lt;p&gt;In this situation the select doesn't know that the signal has arrived.&lt;/p&gt;
&lt;p&gt;Look at this code in:&lt;br /&gt;
1. PoolManagerInit&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;pqsignal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SIGINT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pooler_die&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="n"&gt;pqsignal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SIGTERM&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pooler_die&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="n"&gt;pqsignal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SIGQUIT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pooler_quickdie&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;ol&gt;
&lt;li&gt;pooler_die&lt;br /&gt;
static void&lt;br /&gt;
pooler_die(SIGNAL_ARGS)&lt;br /&gt;
{&lt;br /&gt;
    shutdown_requested = true;&lt;br /&gt;
}&lt;/li&gt;
&lt;/ol&gt;&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">peace zone</dc:creator><pubDate>Mon, 21 Jul 2014 09:09:41 -0000</pubDate><guid>https://sourceforge.net432c655fe34146f882aa631ca0e6e078c599cc74</guid></item><item><title>The pooler process doesn't exit some times after excuting "stop all"</title><link>https://sourceforge.net/p/postgres-xc/bugs/487/</link><description>&lt;div class="markdown_content"&gt;&lt;p&gt;The pooler process doesn't exit after I execute "stop all" command.&lt;/p&gt;
&lt;p&gt;When I attach gdb to the process, I find it stopped at this place in poolmgr.c:&lt;/p&gt;
&lt;p&gt;2368        retval = select(nfds + 1, &amp;amp;rfds, NULL, NULL, NULL);&lt;br /&gt;
2369        if (shutdown_requested)&lt;/p&gt;
&lt;p&gt;When the server_id is not changed, the select will wait forever.&lt;br /&gt;
The routine never gets a chance to check "shutdown_requested".&lt;/p&gt;
&lt;p&gt;Should a "timeout" be added to the select call?&lt;/p&gt;
&lt;p&gt;The Syslogger process has the same problem.&lt;/p&gt;&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">peace zone</dc:creator><pubDate>Mon, 21 Jul 2014 08:50:56 -0000</pubDate><guid>https://sourceforge.net2fd3dce875ea1e66b053e611737f522421d1d2b9</guid></item></channel></rss>