Welcome to the Builder Academy

Question Is it possible these can be conflicting?

23 Sep 2012 05:54 #818 by zusuk
I guess this question is aimed at Vatiken since he's the author of the current event system.

Is it possible that there could be conflict (in stock behaviour) with:

handler.c
void extract_char(struct char_data *)
clear_char_event_list(ch);

and

comm.c
void event_process(void)

Occasionally on mob deaths, with no consistent pattern that I can see, I get a crash. Here is the trace:
Code:
Sep 22 22:52:41 :: SYSERR: Attempted to cancel a non-NULL unqueued event, freeing anyway
*** glibc detected *** /home/luminari/tba/bin/circle: double free or corruption (fasttop): 0x092d14b8 ***
======= Backtrace: =========
/lib/libc.so.6[0x89b4a5]
/lib/libc.so.6(cfree+0x59)[0x89b8e9]
/home/luminari/tba/bin/circle[0x80d30dd]
/home/luminari/tba/bin/circle[0x80c1cc1]
/home/luminari/tba/bin/circle[0x80c4008]
/home/luminari/tba/bin/circle[0x80c5be7]
/lib/libc.so.6(__libc_start_main+0xdc)[0x847e9c]
/home/luminari/tba/bin/circle[0x8049641]
======= Memory map: ========
0073d000-0073e000 r-xp 0073d000 00:00 0 [vdso]
00813000-0082e000 r-xp 00000000 fd:01 10914822 /lib/ld-2.5.so
0082e000-0082f000 r--p 0001a000 fd:01 10914822 /lib/ld-2.5.so
0082f000-00830000 rw-p 0001b000 fd:01 10914822 /lib/ld-2.5.so
00832000-00986000 r-xp 00000000 fd:01 10914823 /lib/libc-2.5.so
00986000-00987000 ---p 00154000 fd:01 10914823 /lib/libc-2.5.so
00987000-00989000 r--p 00154000 fd:01 10914823 /lib/libc-2.5.so
00989000-0098a000 rw-p 00156000 fd:01 10914823 /lib/libc-2.5.so
0098a000-0098d000 rw-p 0098a000 00:00 0
00a54000-00a5d000 r-xp 00000000 fd:01 10914920 /lib/libcrypt-2.5.so
00a5d000-00a5e000 r--p 00008000 fd:01 10914920 /lib/libcrypt-2.5.so
00a5e000-00a5f000 rw-p 00009000 fd:01 10914920 /lib/libcrypt-2.5.so
00a5f000-00a86000 rw-p 00a5f000 00:00 0
00a93000-00a9e000 r-xp 00000000 fd:01 10914826 /lib/libgcc_s-4.1.2-20080825.so.1
00a9e000-00a9f000 rw-p 0000a000 fd:01 10914826 /lib/libgcc_s-4.1.2-20080825.so.1
08048000-081dc000 r-xp 00000000 fd:01 14159453 /home/luminari/tba/bin/circle
081dc000-081e1000 rw-p 00193000 fd:01 14159453 /home/luminari/tba/bin/circle
081e1000-092fb000 rw-p 081e1000 00:00 0 [heap]
b7c19000-b7ff8000 rw-p b7c19000 00:00 0
b7fff000-b8000000 rw-p b7fff000 00:00 0
bff87000-bfffe000 rw-p bff86000 00:00 0 [stack]

Program received signal SIGABRT, Aborted.
0x008137f2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
(gdb) list
195 #endif
196
197   t->tv_sec = (int) (millisec / 1000);
198   t->tv_usec = (millisec % 1000) * 1000;
199 }
200
201 #endif /* CIRCLE_WINDOWS || CIRCLE_MACINTOSH */
202
203 int main(int argc, char **argv)
204 {
(gdb) up
#1 0x0085adf0 in raise () from /lib/libc.so.6
(gdb) list
205   int pos = 1;
206   const char *dir;
207
208 #ifdef MEMORY_DEBUG
209   zmalloc_init();
210 #endif
211
212 #if CIRCLE_GNU_LIBC_MEMORY_TRACK
213   mtrace(); /* This must come before any use of malloc(). */
214 #endif
(gdb) up
#2 0x0085c701 in abort () from /lib/libc.so.6
(gdb) list
215
216 #ifdef CIRCLE_MACINTOSH
217   /* ccommand() calls the command line/io redirection dialog box from
218    * Codewarriors's SIOUX library. */
219   argc = ccommand(&argv);
220   /* Initialize the GUSI library calls. */
221   GUSIDefaultSetup();
222 #endif
223
224   /* Load the game configuration. We must load BEFORE we use any of the
(gdb) up
#3 0x0089318b in __libc_message () from /lib/libc.so.6
(gdb) list
225    * constants stored in constants.c. Otherwise, there will be no variables
226    * set to set the rest of the vars to, which will mean trouble --> Mythran */
227   CONFIG_CONFFILE = NULL;
228   while ((pos < argc) && (*(argv[pos]) == '-')) {
229     if (*(argv[pos] + 1) == 'f') {
230       if (*(argv[pos] + 2))
231         CONFIG_CONFFILE = argv[pos] + 2;
232       else if (++pos < argc)
233         CONFIG_CONFFILE = argv[pos];
234       else {
(gdb) up
#4 0x0089b4a5 in _int_free () from /lib/libc.so.6
(gdb) list
235         puts("SYSERR: File name to read from expected after option -f.");
236         exit(1);
237       }
238     }
239     pos++;
240   }
241   pos = 1;
242
243   if (!CONFIG_CONFFILE)
244     CONFIG_CONFFILE = strdup(CONFIG_FILE);
(gdb) up
#5 0x0089b8e9 in free () from /lib/libc.so.6
(gdb) list
245
246   load_config();
247
248   port = CONFIG_DFLT_PORT;
249   dir = CONFIG_DFLT_DIR;
250
251   while ((pos < argc) && (*(argv[pos]) == '-')) {
252     switch (*(argv[pos] + 1)) {
253     case 'f':
254       if (! *(argv[pos] + 2))
(gdb) up
#6 0x080d30dd in event_process () at dg_event.c:133
133       free(the_event);
(gdb) list
128     else {
129       if (the_event->isMudEvent && the_event->event_obj != NULL)
130         free_mud_event((struct mud_event_data *) the_event->event_obj);
131       /* It is assumed that the_event will already have freed ->event_obj. */
132       // event_cancel(the_event);
133       free(the_event);
134     }
135
136   }
137 }
(gdb) up
#7 0x080c1cc1 in heartbeat (heart_pulse=1960) at comm.c:974
974   event_process();
(gdb) list
969 void heartbeat(int heart_pulse)
970 {
971   struct char_data *i;
972   static int mins_since_crashsave = 0;
973
974   event_process();
975
976   if (!(heart_pulse % PULSE_DG_SCRIPT))
977     script_trigger_check();
978
(gdb) up
#8 0x080c4008 in game_loop (local_mother_desc=7) at comm.c:940
940       heartbeat(++pulse);
(gdb) list
935       missed_pulses = 30 RL_SEC;
936     }
937
938     /* Now execute the heartbeat functions */
939     while (missed_pulses--)
940       heartbeat(++pulse);
941
942     if (reread_wizlist) {
943       reread_wizlist = FALSE;
944       mudlog(CMP, LVL_IMMORT, TRUE, "Signal received - rereading wizlists.");
(gdb) up
#9 0x080c5be7 in init_game (argc=Cannot access memory at address 0x27ef ) at comm.c:535
535   game_loop(mother_desc);
(gdb) list
530   if (fCopyOver) /* reload players */
531     copyover_recover();
532
533   log("Entering game loop.");
534
535   game_loop(mother_desc);
536
537   Crash_save_all();
538
539   log("Closing all sockets.");
(gdb) up
#10 main (argc=Cannot access memory at address 0x27ef ) at comm.c:354
354     init_game(port);
(gdb) list
349
350   if (scheck)
351     boot_world();
352   else {
353     log("Running game on port %d.", port);
354     init_game(port);
355   }
356
357   log("Clearing game world.");
358   destroy_db();
(gdb) up
Initial frame selected; you cannot go up.
(gdb)

Website
www.luminariMUD.com

Main Game Port
luminariMUD.com:4100

23 Sep 2012 06:01 #819 by zusuk
Oh yeah, the reason I didn't just post the dump, and asked about a possible conflict instead, is that I went to handler.c -> extract_char and changed the code to look like this. It seems to fix the crash bug, but I worry that it may cause other issues:
Code:
void extract_char(struct char_data *ch)
{
  char_from_furniture(ch);
  if (!IS_NPC(ch)) /* ADDED THIS */
    clear_char_event_list(ch);

  if (IS_NPC(ch))
    SET_BIT_AR(MOB_FLAGS(ch), MOB_NOTDEADYET);
  else
    SET_BIT_AR(PLR_FLAGS(ch), PLR_NOTDEADYET);

  extractions_pending++;
}

23 Sep 2012 20:53 #822 by thomas
Strictly speaking, Vatiken isn't the author of the event system, as it was originally added back when DG_scripts were put in.

However, clear_char_event_list() wasn't in that version, so you might just be right.

So, apparently, there's a race condition here somewhere.
My guess is that the problem occurs like this:

1- Some mob script is triggered
2- the script runs, but has a "wait" somewhere in it, so it is enqueued in the event system
3- the mob dies (while the script is enqueued)
4- extract_char clears the event list for the mob (but doesn't remove the now free()'d event from the queue)
5- eventually the event is run
6- the event_func returns 0
7- the event_process() loop tries to free the event a second time.
8- BOOM!

Your fix works because you skip step 4 for mobs. This is not a good idea: a mob's events are then never cleared on death, so you'll bleed lots of memory.
The correct fix is to also make sure that the event is removed from the queue, or to mark it with a no-op function so that it harmlessly vanishes in the next event_process().
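
To illustrate the second option (the no-op marker), here is a minimal sketch. It does not use the real dg_event.c structures: every name prefixed with sketch_ is hypothetical, and the queue is a plain singly linked list. The point it tries to show is that cancelling releases only the payload and swaps in a no-op callback, so the processing loop stays the only place that frees the event struct.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins; NOT the real dg_event.c types. */
typedef long (*sketch_func)(void *event_obj);

struct sketch_event {
  sketch_func func;            /* returns 0 when done, >0 to stay queued */
  void *event_obj;             /* payload owned by the event */
  struct sketch_event *next;   /* next pending event */
};

struct sketch_event *sketch_queue = NULL;
int sketch_runs = 0;           /* how many real callbacks actually ran */
int sketch_frees = 0;          /* how many event structs were freed */

/* Example "real" callback: counts the run and asks not to be requeued. */
long sketch_resume(void *event_obj) { (void) event_obj; sketch_runs++; return 0; }

/* No-op callback installed on cancelled events. */
long sketch_noop(void *event_obj) { (void) event_obj; return 0; }

struct sketch_event *sketch_create(sketch_func func, void *event_obj)
{
  struct sketch_event *e = malloc(sizeof(*e));

  if (e == NULL)
    abort();
  e->func = func;
  e->event_obj = event_obj;
  e->next = sketch_queue;
  sketch_queue = e;
  return e;
}

/* Called when the owner is extracted: release the payload now, but leave
 * the struct queued so sketch_process() is the only place that frees it. */
void sketch_cancel(struct sketch_event *e)
{
  assert(e != NULL);
  free(e->event_obj);
  e->event_obj = NULL;
  e->func = sketch_noop;
}

/* Run each pending event once; finished events are freed exactly once. */
void sketch_process(void)
{
  struct sketch_event *e = sketch_queue, *next;

  sketch_queue = NULL;
  for (; e != NULL; e = next) {
    next = e->next;
    if (e->func(e->event_obj) > 0) {
      e->next = sketch_queue;   /* still waiting: put it back */
      sketch_queue = e;
    } else {
      free(e->event_obj);       /* free(NULL) is a harmless no-op */
      free(e);
      sketch_frees++;
    }
  }
}
```

In this sketch, an extract_char-style cleanup would call sketch_cancel() on each of the dying mob's events instead of freeing them directly, and the next sketch_process() pass would dispose of them.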

26 Sep 2012 00:52 #823 by Vatiken
1) The event system was part of the original implementation of the DG scripting system. The "Mud Event" system is just an additional layer that utilizes the original system; I created it for non-DG events (as of 3.63, primarily pausing the descriptor while waiting for protocol negotiation). Aside from sharing the same event_process() function, the two systems remain mostly separate.

2) If forcing only PCs to clear their event list solves the crashing, then the issue is most likely caused by a "Mud Event" that is being freed but not removed from the mob's event list, so that upon death the MUD tries to free a "Mud Event" that no longer exists.

I personally haven't experienced this issue myself, nor have I heard of it prior to this thread, but as I expected there to be at least some issues with the new event system, I did my best to make it as easy to debug as possible. My advice would be to log which "Mud Event" is being cancelled and which NPC is clearing events at the time of the crash. My guess is that there is probably some consistency in which NPC/event is causing the problem.

The actual process of the event system is quite simple:
1) Add "Mud Event"
2) Wait...
3) Run "Mud Event"
4) If the timer was reset, go back to step 2; else:
5) Remove "Mud Event" from list and free it.

If it's a character mud event, and the character dies, then we remove it from the queue and char list, then free it.
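
The character-death path described above can be sketched as follows. This is only an illustration under simplifying assumptions: the demo_ names and the fixed-size pointer arrays are hypothetical stand-ins for the real queue and per-character list. What matters is the order of operations: unlink the event from both containers first, then free it once.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_MAX_EVENTS 8      /* hypothetical fixed capacity */

struct demo_event { int id; };

/* Both the global queue and a character's list hold pointers to the
 * same event objects, so both must forget an event before it is freed. */
struct demo_list {
  struct demo_event *items[DEMO_MAX_EVENTS];
  int count;
};

void demo_list_add(struct demo_list *l, struct demo_event *e)
{
  assert(l->count < DEMO_MAX_EVENTS);
  l->items[l->count++] = e;
}

/* Remove e if present (order is not preserved); return 1 if found. */
int demo_list_remove(struct demo_list *l, struct demo_event *e)
{
  int i;

  for (i = 0; i < l->count; i++) {
    if (l->items[i] == e) {
      l->items[i] = l->items[--l->count];
      return 1;
    }
  }
  return 0;
}

/* On death: unlink each of the character's events from the queue and
 * from the character's list, then free it. Returns the number freed. */
int demo_clear_char_events(struct demo_list *queue, struct demo_list *char_events)
{
  int freed = 0;

  while (char_events->count > 0) {
    struct demo_event *e = char_events->items[char_events->count - 1];

    demo_list_remove(char_events, e);
    demo_list_remove(queue, e);   /* nothing left to run it later */
    free(e);
    freed++;
  }
  return freed;
}
```

If either removal is skipped, one container keeps a dangling pointer, which matches the kind of double free shown in the trace at the top of the thread.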

... Just peeking through the code right now and if the mud_event_index[] isn't set to the correct "Mud Event" type, it looks as if you might experience an issue like the one you are having.

tbaMUD developer/programmer
