Welcome to the Builder Academy

Question Confused over piece of code in parse_room()

More
26 Jan 2020 17:35 - 26 Jan 2020 18:12 #8505 by krell

thomas wrote: Yes, that's a needed fix for NULL from fread_string().


I see. So if I understand you correctly, and I'm comprehending strchr(3)

DESCRIPTION
The strchr() function locates the first occurrence of the character c
(converted to a char) in the string s. The terminating NUL character is
considered part of the string. If c is ‘\0’, strchr() locates the
terminating ‘\0’.

The index() function is an old synonym for strchr().

RETURN VALUES
The strchr() function returns a pointer to the located character or NULL
if the character does not appear in the string.


WARNING: MUCH VERBOSITY. TESTING MY COMPREHENSION


If strchr() finds "\0" in new_descr->description then *end is assigned the address of "\0", if it exists, otherwise *end is assigned "NULL" and the descr->description exits unmodified. But if "\0" is found at the end of the string in descr->description, and there is no "\n" then a new *end is mallocated that is 3 bits bigger than the original descr->description. The contents of descr->description are dumped into *end and "\r\n" are tacked onto the end of *end.

thomas wrote: And, in case someone needs to search for it, { } is called a scope.

It does something - it limits the scope of any variables to this part of the code. In C, you can only define new variables at the start of a scope. So this scope is here to allow us to create the end variable.


So this tells me that if the preceding code in that case statement is executed then the contents between the braces must be executed.

Which is why descr->description must be freed and reassigned to *end as the life of *end as a variable only exists within the scope defined by that set of braces, but descr->description is a variable exits in an external scope?

I have the feeling that you may have explained this to me at one time in the past thomas, possibly within the mud or to another forum poster, but thank you for your patience in doing so again.Thank you as well @Castillo in taking the time to reply to my post. You have both helped with improving my knowledge.

thomas wrote: It also means that this is a bit hacky. Had I made code like it at work, my colleagues might have stopped me and instead insisted on making me create a function for this scope. I'll see if I can have a look at improving this later.


Is it possible to improve fread_string() to avoid having to strip trailing "\0"'s in a separate scope, or would that unnecessarily complicate fread_string() or lead to unintended consequences?

Addendum:
Shouldn't *end be freed and nulled before exiting the subscope as it had been mallocated?
Last edit: 26 Jan 2020 18:12 by krell. Reason: 1) I had an additional question related to the topic 2) Spelling corrections

Please Log in or Create an account to join the conversation.

More
26 Jan 2020 19:30 #8506 by Castillo

It does something - it limits the scope of any variables to this part of the code. In C, you can only define new variables at the start of a scope. So this scope is here to allow us to create the end variable.


Oh yes, you are right it's limits the scope of the variable 'end'. Never thought, about it.

In C, you can only define new variables at the start of a scope.


but,
Code:
case 'E': CREATE(new_descr, struct extra_descr_data, 1); new_descr->keyword = fread_string(fl, buf2); new_descr->description = fread_string(fl, buf2); /* Fix for crashes in the editor when formatting. E-descs are assumed to * end with a \r\n. -Welcor */ // { char *end = strchr(new_descr->description, '\0'); if (end > new_descr->description && *(end-1) != '\n') { CREATE(end, char, strlen(new_descr->description)+3); sprintf(end, "%s\r\n", new_descr->description); /* snprintf ok : size checked above*/ free(new_descr->description); new_descr->description = end; } // } new_descr->next = world[room_nr].ex_description; world[room_nr].ex_description = new_descr; break;

If i remove your {} scope, it compile just fine. Shouldn't 'end' have a scope until break which seems fine for this piece of code, or is it non-standard 'C' doing so? Also, 'end' become permanent memory with CREATE, so this seems all good.

Bob

Please Log in or Create an account to join the conversation.

More
26 Jan 2020 20:22 - 26 Jan 2020 21:03 #8507 by Castillo

*end is mallocated that is 3 bits bigger than the original

3 bytes :)
(bits are for binary 01010010)

So this tells me that if the preceding code in that case statement is executed then the contents between the braces must be executed.

There is no condition on the braces, so yes the code is executed, if we enter the
case 'E' :

Which is why new_descr->description must be freed and reassigned to *end as the life of *end as a variable only exists within the scope defined by that set of braces, but new_descr->description is a variable exits in an external scope?


if (new_descr->description) is not NULL.

It contains permanant memory. coming from fread_string()
You should check strdup() function.

actually, it must be freed, because we want to replace that memory by a new content. Otherwise, it's a memory leak.

You must understand new_descr->description is a pointer to a string of fixed length.

The size of that string is exactly the length of that string... so, there is no space left to add more to it. It's not like a buffer, that if it's not full and you can add to it. HERE the buffer is already FULL.

The content of the pointer must be freed, and the pointer itself assign to a new pointer that indicate the location of a new buffer with the bigger string.

The bigger and new string created by CREATE(end, char, strlen(new_descr->description)+3);
is also permanent memory.
so, new_descr->description will point to a permanent memory string.
It's doesn't matter after that if the variable 'end' is destroyed.
new_descr->description knows that address of the string memory. And, that isn't destroy, neither the permanent memory.

Bob
Last edit: 26 Jan 2020 21:03 by Castillo.

Please Log in or Create an account to join the conversation.

More
26 Jan 2020 20:25 #8508 by thomas

If i remove your {} scope, it compile just fine. Shouldn't 'end' have a scope until break which seems fine for this piece of code, or is it non-standard 'C' doing so? Also, 'end' become permanent memory with CREATE, so this seems all good.


Yes, this is non-standard C ("C89"). In C99 (which in itself is a standard), the limitation on where you can declare at variable was lifted. It's still good practice to avoid "just adding" a variable in the middle of a function - mostly for readability, but also because the variable is not accessible above its declaration, and that can be confusing.

K&R writes: "Declarations of variables (including initializations) may follow the left brace that introduces any compound statement, not just the one that begins a function". But, as you have discovered, later versions changed this.

If strchr() finds "\0" in new_descr->description then *end is assigned the address of "\0", if it exists, otherwise *end is assigned "NULL" and the descr->description exits unmodified. But if "\0" is found at the end of the string in descr->description, and there is no "\n" then a new *end is mallocated that is 3 bits bigger than the original descr->description. The contents of descr->description are dumped into *end and "\r\n" are tacked onto the end of *end.


No, what is actually happening is that we're reusing the end variable.

First off - you need to understand that a string in C is "just a byte array". There's nothing fancy about it [3]. It contains chars by their ASCII byte value. To signify when the string ends, we put in a 0 byte. So the string "hi" will be represented by the char array [104, 105, 0]. The character notation in C uses single quotes and backslashes to signify the same thing: "hi" = 'h','i','\0' = '\104','\105','\0'
So all non-NULL strings contain at least one character - the 0 byte. Using the same annotation, the empty string ("") looks like this : '\0'.
Code:
CREATE(new_descr, struct extra_descr_data, 1); // create the struct new_descr->keyword = fread_string(fl, buf2); // read the keywords new_descr->description = fread_string(fl, buf2); // read the description /* Fix for crashes in the editor when formatting. E-descs are assumed to * end with a \r\n. -Welcor */ { // start a new scope // declare the variable end as a pointer to the end of the string. char *end = strchr(new_descr->description, '\0'); // check 1) that the string had length and 2) that it didn't end with \n if (end > new_descr->description && *(end-1) != '\n') { // [1] REUSE the end variable to point to a newly created empty piece of memory with enough room for \r\n\0 CREATE(end, char, strlen(new_descr->description)+3); // [2] write the contents of the loaded description into this new memory, with the \r\n\0 added again. sprintf(end, "%s\r\n", new_descr->description); /* snprintf ok : size checked above*/ free(new_descr->description); // free the memory allocated by fread_string() new_descr->description = end; // point the description to the newly allocated memory } } // break out of scope. The "end" variable no longer exists.

[1] Reusing is also hacky. As you a evidence of, it is clever, but confusing.
[2] Note - there is an invisible '\0' after the \r\n in this sprintf().
[3] Fancy strings in C is more trouble than their worth, by the way. If you want to read about it, look up wchar.

Please Log in or Create an account to join the conversation.

More
26 Jan 2020 21:04 #8511 by thomas
The following user(s) said Thank You: krell

Please Log in or Create an account to join the conversation.

More
26 Jan 2020 22:08 #8515 by krell

Castillo wrote: 3 bytes :)
(bits are for binary 01010010

Embarrassingly for me, I actually knew that. Oops.

thomas wrote: No, what is actually happening is that we're reusing the end variable.

First off - you need to understand that a string in C is "just a byte array". There's nothing fancy about it [3]. It contains chars by their ASCII byte value. To signify when the string ends, we put in a 0 byte. So the string "hi" will be represented by the char array [104, 105, 0]. The character notation in C uses single quotes and backslashes to signify the same thing: "hi" = 'h','i','\0' = '\104','\105','\0'
So all non-NULL strings contain at least one character - the 0 byte. Using the same annotation, the empty string ("") looks like this : '\0'.

Further embarrassment. I actually do know that strings in C all contain the null character. My apologies for not thinking that part through more carefully.

Code:
} } // break out of scope. The "end" variable no longer exists.


And also

Castillo wrote: The bigger and new string created by CREATE(end, char, strlen(new_descr->description)+3);
is also permanent memory.
so, new_descr->description will point to a permanent memory string.
It's doesn't matter after that if the variable 'end' is destroyed.
new_descr->description knows that address of the string memory. And, that isn't destroy, neither the permanent memory.


If I understand correctly, because the original *end and mallocated *end have the same address as new_descr->description, plus *end being defined within this scope, means that *end disappears without having to free and null it.

thomas wrote: [1] Reusing is also hacky. As you a evidence of,


Well, I do try to be useful.

Please Log in or Create an account to join the conversation.

Time to create page: 0.228 seconds