Welcome to the Builder Academy

Question Crash on Shutdown

06 Mar 2016 16:31 #5622 by krell
Crash on Shutdown was created by krell
I get a non-fatal crash occurring when rebooting the mud and a fatal crash when shutting down. Here's the syslog.CRASH output.

Code:
Mar 06 15:19:30 201 :: chtmp->master. 0x0
Mar 06 15:19:30 201 :: chtmp->master. 0x0
Mar 06 15:19:30 201 :: chtmp->master. 0x0
Mar 06 15:19:30 201 :: chtmp->master. 0x0
Mar 06 15:19:30 201 :: chtmp->master. 0x0
Mar 06 15:19:30 201 :: chtmp->master. 0x0
Mar 06 15:19:30 201 :: chtmp->master. 0x0
Mar 06 15:19:30 201 :: chtmp->master. 0x0
Mar 06 15:19:30 201 :: chtmp->master. 0x0
Mar 06 15:19:30 201 :: chtmp->master. 0x0
Mar 06 15:19:30 201 :: chtmp->master. 0x0
Mar 06 15:19:30 201 :: chtmp->master. 0x0
Mar 06 15:19:30 201 :: chtmp->master. 0x0
Mar 06 15:19:30 201 :: chtmp->master. 0x0
Mar 06 15:19:30 201 :: chtmp->master. 0x0
Mar 06 15:19:30 201 :: chtmp->master. 0x0
Mar 06 15:19:30 201 :: chtmp->master. 0x0
Mar 06 15:19:30 201 :: chtmp->master. 0x0
Mar 06 15:19:30 201 :: chtmp->master. 0x0
Mar 06 15:19:30 201 :: chtmp->master. 0x0
Mar 06 15:19:30 201 :: chtmp->master. 0x0
Mar 06 15:19:30 201 :: chtmp->master. 0x0
Mar 06 15:19:30 201 :: chtmp->master. 0x0
Mar 06 15:19:30 201 :: chtmp->master. 0x0
Mar 06 15:19:30 201 :: chtmp->master. 0x0
Mar 06 15:19:30 201 :: chtmp->master. 0x0
Mar 06 15:19:30 201 :: chtmp->master. 0x379fe0f8c00
Mar 06 15:19:30 201 :: chtmp->master. 0x379fe0f8c00
Bus error (core dumped)
autoscript terminated Sun Mar 6 15:19:30 UTC 2016

Here is the gdb session on the core file, if its output is to be believed.

Code:
$ doas gdb bin/circle lib/circle.core
doas (schnes@gemini.wss-ds.org) password:
GNU gdb 6.3
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-unknown-openbsd5.9"...
Core was generated by `circle'.
Program terminated with signal 10, Bus error.
Loaded symbols for /home/mud/clockwerx/bin/circle
Reading symbols from /usr/lib/libc.so.84.2...done.
Loaded symbols for /usr/lib/libc.so.84.2
Reading symbols from /usr/libexec/ld.so...done.
Loaded symbols for /usr/libexec/ld.so
#0 act (str=0x377f3f2f7c2 "$n stops following $N.", hide_invisible=1, ch=0x37a53581800, obj=0x0, vict_obj=0x379fe0f8c00, type=3) at comm.c:2685
2685        if (!SENDOK(to) || (to == ch))
(gdb) bt
#0 act (str=0x377f3f2f7c2 "$n stops following $N.", hide_invisible=1, ch=0x37a53581800, obj=0x0, vict_obj=0x379fe0f8c00, type=3) at comm.c:2685
#1 0x00000377f3d4c594 in stop_follower (ch=0x37a53581800) at utils.c:558
#2 0x00000377f3dca9d4 in destroy_db () at db.c:496
#3 0x00000377f3e1202d in main (argc=3, argv=0xa00f) at comm.c:363
(gdb) list
363       destroy_db();
364
365       if (!scheck) {
366         log("Clearing other memory.");
367         free_bufpool(); /* comm.c */
368         free_player_index(); /* players.c */
369         free_messages(); /* fight.c */
370         free_text_files(); /* db.c */
371         board_clear_all(); /* boards.c */
372         free(cmd_sort_info); /* act.informative.c */
(gdb) up
#1 0x00000377f3d4c594 in stop_follower (ch=0x37a53581800) at utils.c:558
558         act("$n stops following $N.", TRUE, ch, 0, ch->master, TO_NOTVICT);
(gdb) list
553         act("$n hates your guts!", FALSE, ch, 0, ch->master, TO_VICT);
554         if (affected_by_spell(ch, SPELL_CHARM))
555           affect_from_char(ch, SPELL_CHARM);
556       } else {
557         act("You stop following $N.", FALSE, ch, 0, ch->master, TO_CHAR);
558         act("$n stops following $N.", TRUE, ch, 0, ch->master, TO_NOTVICT);
559         act("$n stops following you.", TRUE, ch, 0, ch->master, TO_VICT);
560       }
561
562       if (ch->master->followers->follower == ch) { /* Head of follower-list? */
(gdb) up
#2 0x00000377f3dca9d4 in destroy_db () at db.c:496
496           stop_follower(chtmp);
(gdb) list
491       while (character_list) {
492         chtmp = character_list;
493         log("chtmp->master. %p ", chtmp->master);
494         character_list = character_list->next;
495         if (chtmp->master)
496           stop_follower(chtmp);
497         free_char(chtmp);
498       }
499
500       /* Active Objects */
(gdb) down
#1 0x00000377f3d4c594 in stop_follower (ch=0x37a53581800) at utils.c:558
558         act("$n stops following $N.", TRUE, ch, 0, ch->master, TO_NOTVICT);
(gdb) down
#0 act (str=0x377f3f2f7c2 "$n stops following $N.", hide_invisible=1, ch=0x37a53581800, obj=0x0, vict_obj=0x379fe0f8c00, type=3) at comm.c:2685
2685        if (!SENDOK(to) || (to == ch))
(gdb) list
2680        log("SYSERR: no valid target to act()!");
2681        return NULL;
2682      }
2683
2684      for (; to; to = to->next_in_room) {
2685        if (!SENDOK(to) || (to == ch))
2686          continue;
2687        if (hide_invisible && ch && !CAN_SEE(to, ch))
2688          continue;
2689        if (type != TO_ROOM && to == vict_obj)
(gdb) print
The history is empty.
(gdb) up
#1 0x00000377f3d4c594 in stop_follower (ch=0x37a53581800) at utils.c:558
558         act("$n stops following $N.", TRUE, ch, 0, ch->master, TO_NOTVICT);
(gdb) print
The history is empty.
(gdb) up
#2 0x00000377f3dca9d4 in destroy_db () at db.c:496
496           stop_follower(chtmp);
(gdb) print
The history is empty.
(gdb) up
#3 0x00000377f3e1202d in main (argc=3, argv=0xa00f) at comm.c:363
363       destroy_db();
(gdb) print
The history is empty.
(gdb) up
Initial frame selected; you cannot go up.
(gdb)

06 Mar 2016 20:45 - 06 Mar 2016 20:46 #5623 by thomas
Replied by thomas on topic Crash on Shutdown
Seems it's the SENDOK macro that's giving you trouble. It's defined like this in utils.h:
Code:
/** Defines if it is ok to send a message to ch. */
#define SENDOK(ch) (((ch)->desc || SCRIPT_CHECK((ch), MTRIG_ACT)) && \
                    (to_sleeping || AWAKE(ch)) && \
                    !PLR_FLAGGED((ch), PLR_WRITING))
I'd suggest trying a slight rewrite of the relevant portion of the act() function.

From (comm.c, around line 2685):
Code:
for (; to; to = to->next_in_room) {
  if (!SENDOK(to) || (to == ch))
    continue;
  if (hide_invisible && ch && !CAN_SEE(to, ch))
    continue;
  if (type != TO_ROOM && to == vict_obj)
    continue;

  perform_act(str, ch, obj, vict_obj, to);
}

to something like (browser code):
Code:
for (; to; to = to->next_in_room) {
  /* SENDOK(to), split into its three parts so gdb shows which one faults. */
  if (!to->desc && !SCRIPT_CHECK(to, MTRIG_ACT))
    continue;
  if (!to_sleeping && !AWAKE(to))
    continue;
  if (PLR_FLAGGED(to, PLR_WRITING))
    continue;
  if (to == ch)
    continue;
  if (hide_invisible && ch && !CAN_SEE(to, ch))
    continue;
  if (type != TO_ROOM && to == vict_obj)
    continue;

  perform_act(str, ch, obj, vict_obj, to);
}
This should make it more obvious which check is failing.
I assume it's the first check, the one with the SCRIPT_CHECK call. If that is the case, check the scripts on the culprit. You may be missing a SCRIPT(x) = NULL after a free() somewhere.
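To illustrate the "SCRIPT(x) = NULL after a free()" idea, here is a minimal sketch. The struct layouts are stand-ins invented for the example, and FREE_AND_NULL and release_script are hypothetical names, not part of the codebase; only the SCRIPT macro follows the expansion quoted in this thread.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in types for illustration only; the real structs are much larger. */
struct script_data { int placeholder; };
struct char_data { struct script_data *script; };

/* Same shape as the expansion quoted in the thread: ((o)->script). */
#define SCRIPT(o) ((o)->script)

/* Hypothetical helper: free a pointer and reset it in one step, so a
 * later "if (SCRIPT(ch))" style check sees NULL, not a stale address. */
#define FREE_AND_NULL(p) do { free(p); (p) = NULL; } while (0)

/* Hypothetical cleanup routine showing the pattern. */
void release_script(struct char_data *ch)
{
  assert(ch != NULL);
  FREE_AND_NULL(SCRIPT(ch));
}
```

Calling release_script() twice on the same character is harmless, because free(NULL) is defined to do nothing.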
Last edit: 06 Mar 2016 20:46 by thomas. Reason: corrected markup

20 Mar 2016 04:48 - 22 Mar 2016 22:02 #5631 by krell
Replied by krell on topic Crash on Shutdown
I don't seem to see any explicit free(SCRIPT(x)) of any sort in the code.

The SCRIPT macro expands to ((o)->script). I suppose I could look for free() calls on pointers similar to that.

Addendum:

I haven't found any free(), DISPOSE or STRFREE calls that take that pointer as an argument. Perhaps I'm missing something. Could the pointer be manipulated in some function under a different variable name? Nothing obvious jumps out at me.
Last edit: 22 Mar 2016 22:02 by krell. Reason: updating progress

22 Apr 2016 09:13 #5787 by prool
Replied by prool on topic Crash on Shutdown
Hello, colleagues!

Another crash on shutdown here. (The "prooldebug" messages are my debugging printf output.)
Code:
Apr 22 07:49:41 :: Boot db -- DONE.
Apr 22 07:49:41 :: Signal trapping.
Apr 22 07:49:41 :: Entering game loop.
Apr 22 07:49:41 :: No connections. Going to sleep.
Apr 22 09:19:12 :: New connection. Waking up.
Apr 22 09:19:19 :: Prool has connected.
Apr 22 09:19:21 :: Prool retrieving crash-saved items and entering game.
Apr 22 09:19:21 :: Prool (level 34) has 1 object (max 30).
Apr 22 09:24:12 :: nusage: 1 sockets connected, 1 sockets playing
Apr 22 10:11:04 :: SYSERR: Missed 2798 seconds worth of pulses.
Apr 22 10:12:20 :: (GC) Shutdown by Prool.
Apr 22 10:12:20 :: Closing all sockets.
Apr 22 10:12:20 0 :: Closing link to: Prool.
Apr 22 10:12:20 :: Saving current MUD time.
Apr 22 10:12:20 :: Normal termination of game.
Apr 22 10:12:20 :: Clearing game world.
prooldebug: locate follower who is not head of list
prooldebug: locate follower who is not head of list

Program received signal SIGSEGV, Segmentation fault.
0x00000000004acfcf in stop_follower (ch=ch@entry=0x1764cf0) at utils.c:572
572 { for (k = ch->master->followers; k->next->follower != ch; k = k->next);
(gdb)

With best regards,

Prool

27 Apr 2016 12:32 #5799 by prool
Replied by prool on topic Crash on Shutdown
Hello,

Using gdb, I traced the shutdown crash to this line in utils.c:

{ for (k = ch->master->followers; k->next->follower != ch; k = k->next);

I made a very dirty hack to avoid the crash:

if (circle_shutdown==1) return; // before this line
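The loop above never checks k->next for NULL, so it runs off the end of the list whenever ch is not actually in its master's follower list. A less drastic alternative to returning early might look like the sketch below. The two structs are cut down to the fields visible in the gdb listings, and unlink_follower and add_follower are hypothetical names for this example, not functions from the codebase.

```c
#include <assert.h>
#include <stdlib.h>

struct char_data;

/* Reduced to the fields that appear in the gdb output. */
struct follow_type {
  struct char_data *follower;
  struct follow_type *next;
};

struct char_data {
  struct char_data *master;
  struct follow_type *followers;
};

/* Hypothetical helper: put ch at the head of leader's follower list. */
int add_follower(struct char_data *ch, struct char_data *leader)
{
  struct follow_type *node;

  assert(ch != NULL && leader != NULL);
  node = malloc(sizeof(*node));
  if (node == NULL)
    return 0;
  node->follower = ch;
  node->next = leader->followers;
  leader->followers = node;
  ch->master = leader;
  return 1;
}

/* Hypothetical helper: unlink ch from its master's follower list.
 * Walking a pointer-to-pointer handles the head and non-head cases the
 * same way, and the loop stops at the end of the list.
 * Returns 1 if ch was found and removed, 0 otherwise. */
int unlink_follower(struct char_data *ch)
{
  struct follow_type **link, *node;

  if (ch == NULL || ch->master == NULL)
    return 0;

  for (link = &ch->master->followers; *link != NULL; link = &(*link)->next) {
    if ((*link)->follower == ch) {
      node = *link;
      *link = node->next;
      free(node);
      ch->master = NULL;
      return 1;
    }
  }

  ch->master = NULL; /* Not in the list: drop the inconsistent link anyway. */
  return 0;
}
```

This only protects against a follower that is missing from the list. It does nothing if ch->master itself points at memory that has already been freed.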

Prool

06 May 2016 03:45 #5839 by krell
Replied by krell on topic Crash on Shutdown
I haven't had much time this past month to look any further into this issue. If I recall correctly, isn't that a macro that's called from the checks in comm.c? It only seems familiar because I was chasing down all of those macros by hand.

The question remains: why does one of the pointers point to a non-null address? Checking for the shutdown flag is alright, but it leaves me feeling a bit unsatisfied.
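One possible explanation, not confirmed anywhere in this thread: the loop in destroy_db() frees characters in list order and only calls stop_follower() for characters that have a master. If a master is freed before its follower, the follower's master pointer still holds the old address, which would fit the non-null chtmp->master values at the end of the log. Under that assumption, a two-pass teardown avoids touching freed memory. The sketch below uses cut-down structs and hypothetical function names (new_char, follow, sever_all_follow_links, free_all_characters); it is not the codebase's actual fix.

```c
#include <assert.h>
#include <stdlib.h>

struct char_data;

struct follow_type {
  struct char_data *follower;
  struct follow_type *next;
};

struct char_data {
  struct char_data *master;
  struct follow_type *followers;
  struct char_data *next; /* link in the global character list */
};

/* Hypothetical helper: allocate a character and push it onto the list. */
struct char_data *new_char(struct char_data **list)
{
  struct char_data *ch = calloc(1, sizeof(*ch));

  if (ch != NULL) {
    ch->next = *list;
    *list = ch;
  }
  return ch;
}

/* Hypothetical helper: make ch follow leader. */
int follow(struct char_data *ch, struct char_data *leader)
{
  struct follow_type *node = malloc(sizeof(*node));

  if (node == NULL)
    return 0;
  node->follower = ch;
  node->next = leader->followers;
  leader->followers = node;
  ch->master = leader;
  return 1;
}

/* Pass 1: clear every follow relationship while all characters are
 * still allocated, so no pointer is followed into freed memory. */
void sever_all_follow_links(struct char_data *list)
{
  struct char_data *ch;
  struct follow_type *f, *next_f;

  for (ch = list; ch != NULL; ch = ch->next) {
    for (f = ch->followers; f != NULL; f = next_f) {
      next_f = f->next;
      if (f->follower != NULL)
        f->follower->master = NULL;
      free(f);
    }
    ch->followers = NULL;
    ch->master = NULL;
  }
}

/* Pass 2: free the characters. Returns how many were freed. */
size_t free_all_characters(struct char_data **list)
{
  size_t freed = 0;
  struct char_data *ch;

  sever_all_follow_links(*list);
  while (*list != NULL) {
    ch = *list;
    *list = ch->next;
    assert(ch->master == NULL && ch->followers == NULL);
    free(ch);
    freed++;
  }
  return freed;
}
```

Unlike the shutdown-flag check, this sends no "stops following" messages at all during teardown, which seems reasonable once every socket is already closed.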


When I have some more free time I'll try beating my head against that wall yet again. :-)
