Returning strings from C functions


Viewing 12 posts - 1 through 12 (of 12 total)
  • #2388
    cort
    Keymaster

    The C Programming Language

    Kernighan and Ritchie… to get your minds thinking about translating xtUML to C and specifically about returning strings. A technical write-up (analysis note) will follow after the holiday.

    #2441
    cort
    Keymaster

    Consider the xtUML OAL (object action language) snippet below.

    name and logo are local string variables.
    buffer is a class.
    scmp and simple are operations which each return a string.
    scmp takes 2 string parameters.

    Notice how the return string from simple is being passed as a parameter into the second invocation of scmp.


    name = "BridgePoint"; logo = "xtUML";

    name = buffer::scmp( s1:"action", s2:"language" );

    logo = buffer::scmp( s1:logo, s2:buffer::simple() );

    #2444
    cort
    Keymaster

    The MC-3020 model compiler translates the above OAL into C.

    C does not have a native string type, so returning strings from functions poses challenges. Namely, a function can return a pointer, but the string data itself lives “behind the pointer”, in whatever storage the pointer refers to. This is usually O.K.

    However, consider the operations scmp and simple above. What if the strings returned were local variables formed and calculated inside the scope of the respective functions?

    In such a case, the returned pointer refers to stack space that was released when the function returned, and dereferencing it is undefined behavior. This is not good.
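A minimal sketch of the two cases described above, not MC-3020 output. The function names are hypothetical. The first function returns a pointer to a string literal, which has static storage and stays valid. The second, left uncompiled on purpose, returns a pointer to a local array.

```c
#include <string.h>

/* Usually O.K.: a string literal has static storage duration,
 * so the returned pointer remains valid after the call. */
const char *simple_literal(void)
{
    return "xtUML";
}

#if 0
/* Hazard, deliberately excluded from compilation: `local` lives in this
 * call's stack frame, so the returned pointer dangles once the function
 * returns. Dereferencing it in the caller is undefined behavior. */
const char *simple_dangling(void)
{
    char local[16];
    strcpy(local, "xtUML");
    return local;
}
#endif
```

Many compilers warn about the second form when it is compiled, but a warning is not guaranteed in less direct cases, for example when the pointer passes through another variable or function first.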

    To address this, an analysis has been performed outlining options for changing the MC-3020 model compiler. The analysis is documented here.

    https://github.com/cortlandstarrett/bridgepoint/blob/master/doc-bridgepoint/notes/589_stringtest/586_stringtest.ant.md

    Please read the above note and provide your thoughts here.

    #2445
    Lee Riemenschneider
    Participant

    My vote would be for the by-reference method. Is there a case for supporting multiple ways, chosen by marking?

    #2451
    cort
    Keymaster

    @Lee, why? I want to see/hear objective reasoning and rationale.

    #2457
    keithbrown
    Keymaster

    Option 5.1.6 (Architectural by-ref param) is my favored choice.

    – It avoids re-entrancy problems that a static in the callee (or global scope) would have
    – It is a common (if not the most common) way of handling this problem in C
    – It performs well by avoiding unnecessary copying of data
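The by-ref parameter idea can be sketched as follows, assuming the caller owns a fixed-size buffer and passes it in as an extra parameter. Everything here is hypothetical: the names `buffer_simple`, `buffer_scmp`, `translate_logo_statement` and `MAX_STRING_LEN`, and the operation bodies (the thread does not say what `scmp` and `simple` compute, so `scmp` simply joins its arguments). It is not the code MC-3020 generates.

```c
#include <stdio.h>
#include <string.h>

#define MAX_STRING_LEN 32  /* hypothetical fixed string capacity */

/* Hypothetical body: writes a constant string into the caller's buffer. */
char *buffer_simple(char result[MAX_STRING_LEN])
{
    snprintf(result, MAX_STRING_LEN, "%s", "simple");
    return result;
}

/* Hypothetical body: writes s1 followed by s2 into the caller's buffer.
 * The result buffer must not overlap s1 or s2. */
char *buffer_scmp(char result[MAX_STRING_LEN], const char *s1, const char *s2)
{
    snprintf(result, MAX_STRING_LEN, "%s%s", s1, s2);
    return result;
}

/* One possible translation of:
 *   logo = buffer::scmp( s1:logo, s2:buffer::simple() );
 * Each call gets its own caller-owned temporary, so nothing is shared
 * between invocations and no pointer outlives its storage. */
void translate_logo_statement(char logo[MAX_STRING_LEN])
{
    char t1[MAX_STRING_LEN];
    char t2[MAX_STRING_LEN];

    buffer_scmp(t1, logo, buffer_simple(t2));
    strcpy(logo, t1);  /* t1 is NUL-terminated and fits in logo */
}
```

Returning the buffer pointer lets one call be nested as an argument of another, as in the OAL. The temporary `t1` is needed because `logo` is both an input and the assignment target.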

    #2461
    Lee Riemenschneider
    Participant

    I think the main reason I liked it is that it appears to give more flexibility over the span of the call. Consider that we have a pointer that is known by both the callee and the caller. The callee can resize the pointed-to memory space, or can write within the original size. The callee can also take the value pointed to, send it to another process, and write the returned value into the memory space, i.e., asynchronous handling of a synchronous call.

    Of course, there are some further questions raised. Do we need a size parameter, or will all strings be NUL-terminated? Is there any danger of overwriting memory?
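One way to frame the size question is sketched below, assuming the caller passes the buffer capacity alongside the pointer. The helper `bounded_copy` is hypothetical and not part of MC-3020. It truncates rather than writing past the end of the destination, and it always leaves the destination NUL-terminated.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst, which holds dst_size bytes.
 * Copies at most dst_size - 1 characters and always NUL-terminates
 * when dst_size > 0. Returns the number of characters copied,
 * excluding the terminator. */
size_t bounded_copy(char *dst, size_t dst_size, const char *src)
{
    size_t n;

    if (dst_size == 0) {
        return 0;               /* no room even for the terminator */
    }
    n = strlen(src);
    if (n > dst_size - 1) {
        n = dst_size - 1;       /* truncate instead of overrunning dst */
    }
    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}
```

Without a size parameter, the callee has to assume a capacity agreed on elsewhere, such as a compile-time maximum string length. That avoids the extra argument but moves the overwrite risk to whoever sets that maximum.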

    #2605
    cort
    Keymaster

    A factor not documented in the analysis note is the commonality between C and C++.

    MC-3020 is both a C and C++ model compiler. The delta between the two model compiler dialects is (perhaps surprisingly) small. Of course C and C++ differ in syntax in some areas, but C can be made “to do things like C++” in most situations. And usually, “doing it like C++” is a Good Thing.

    I see potential value in returning strings “like C++ does it”.

    #3177
    Erik Wedin
    Participant

    I vote for the 5.1.6 Architectural By-Ref Parameter approach since it seems to eliminate all the problems in the other methods like static/shared variables and fixed size buffers etc. This is a design approach used successfully when we translated xtUML into Ada in our Ada MC.
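A small sketch of the static/shared-variable problem mentioned above, using hypothetical functions `label_static` and `label_by_ref`. With one shared static buffer, a second call overwrites the result of the first. With caller-supplied buffers, each result has its own storage.

```c
#include <stdio.h>

/* Hypothetical operation returning a pointer to a single static buffer.
 * Every call reuses the same storage, so earlier results are overwritten,
 * and concurrent callers would race on it. */
const char *label_static(int id)
{
    static char shared[16];
    snprintf(shared, sizeof shared, "id-%d", id);
    return shared;
}

/* By-ref alternative: the caller supplies the storage and its size. */
char *label_by_ref(char *result, size_t size, int id)
{
    snprintf(result, size, "id-%d", id);
    return result;
}
```

After calling `label_static(1)` and then `label_static(2)`, both returned pointers refer to the same buffer, which now holds the second result. This matters whenever two results are needed at once, such as two string-returning operations used as arguments to one call.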

    #3179
    Erik Wedin
    Participant

    It is also important to find an approach that static and run-time analysis tools accept as error-free code.

    We are running continuous analysis of our generated code with “cppcheck” (http://cppcheck.sourceforge.net/), and currently the code generated by MC-3020 triggers a large number of warnings/errors. Some are valid; others are not, because of xtUML semantics such as instance selections across unconditional associations, which would (should) not return a null pointer.

    We are using “Valgrind” to perform run-time checks and it flags the return of strings from functions as erroneous.

    #3188
    UltraDark
    Participant

    I’ve written a few model compilers over the years that translated to ANSI C and the Architectural By-Ref Parameter option was implemented in the vast majority of invocations.

    It’s faster, it’s more memory efficient, the string length does not have to be predetermined in the callee, and it’s thread-safe.

    The amount of effort for implementation pales into insignificance considering the effort needed to create the model compiler in the first place, and Once It’s Done It’s Done For Everyone Forever (TM).

    #3874
    cort
    Keymaster

    Thanks, everyone, for the comments. This was an interesting design.

    In this design note: https://github.com/xtuml/bridgepoint/blob/master/doc-bridgepoint/notes/589_stringtest/589_returnstring.dnt.md
    …you will read that we ended up going with the by-ref parameter design. Even though some may see the “xtuml_string” return design as more natural, ANSI C and gcc are simply not quite ready for it.
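For comparison, a return-by-value design could look like the sketch below. It assumes that “xtuml_string” means a struct wrapping a fixed-size character array; that layout, `MAX_STRING_LEN` and `simple_by_value` are assumptions for illustration, and the design note remains the authority on what was actually evaluated.

```c
#include <string.h>

#define MAX_STRING_LEN 32  /* hypothetical fixed string capacity */

/* Assumed layout: C cannot return a bare array, but it can return a
 * struct that contains one, copying the whole array by value. */
typedef struct {
    char s[MAX_STRING_LEN];
} xtuml_string;

/* Hypothetical operation returning its string by value. */
xtuml_string simple_by_value(void)
{
    xtuml_string r = { "" };    /* zero-fills the whole array */
    strcpy(r.s, "simple");
    return r;                   /* the caller receives its own copy */
}
```

The caller would store the result in a local `xtuml_string` before reading its `s` member. The trade-off is a full copy of the array on every return.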

    OIDIDFEF!
