This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



RE: GSoC :Project Idea(Before final Submission) for review and feedback



-----Original Message-----
From: Subrata Biswas [mailto:subrata.iitr@gmail.com] 
Sent: Sunday, March 25, 2012 12:22 PM
To: Oleg Endo
Cc: gcc
Subject: Re: GSoC :Project Idea(Before final Submission) for review and feedback

Thank you sir for your excellent example.

On 25 March 2012 15:25, Oleg Endo <oleg.endo@t-online.de> wrote:
> Please reply in CC to the GCC mailing list, so others can follow the 
> discussion.
>
> On Sun, 2012-03-25 at 09:21 +0530, Subrata Biswas wrote:
>> On 25 March 2012 03:59, Oleg Endo <oleg.endo@t-online.de> wrote:
>> >
>> > I might be misunderstanding the idea...
>> > Let's assume you've got a program that doesn't compile, and you 
>> > leave out those erroneous blocks to enforce successful compilation 
>> > of the broken program.  How are you going to figure out for which 
>> > blocks it is actually safe to be removed and for which it isn't?
>>
>> I can do it by tracing the code blocks that depend on the
>> erroneous block, i.e. any block or line of code that is data- or
>> control-dependent on the erroneous block (it reads a value that the
>> erroneous part outputs or writes) will be eliminated.
>>
>> > Effectively, you'll
>> > be changing the original semantics of a program, and those semantic 
>> > changes might be completely not what the programmer originally had 
>> > in mind.  In the worst case, something might end up with an
>> > (un)formatted harddisk...*
>> >
>> > Cheers,
>> > Oleg
>> >
>> Thank you sir for your great feedback. You have understood it
>> correctly. The programmer will be informed about the change in
>> the code and its semantics. (Note that this plug-in is not going to
>> modify the original code; it just copies the original code and
>> performs all operations on the temporary file.) Even from the
>> partial execution of the code, the programmer will get an overview
>> of his/her actual progress.
>>
>> Suppose the program written by the programmer is:
>>
>> 1 int main(void)
>> 2 {
>> 3     int arr[]={3,4,-10,22,33,37,11};
>> 4     sort(arr);
>> 5     int a = arr[3] // Now suppose the programmer missed the
>>       // semicolon here, which generates a compilation error at line 5.
>> 6     printf("%d\n",a);
>> 7     for(i=0;i<7;i++)
>> 8     {
>> 9         printf("%d\n",arr[i]);
>> 10    }
>> 11 }
>>
>>
>> Now if we analyze the data (i.e. the variables), we can easily see
>> that the only data dependency is between line 5 and line 6.
>> The rest of the program is not affected by eliminating or
>> commenting out line 5.
>>
>> Hence the temporary source file after commenting out the erroneous 
>> part of the code and the code segment that is dependent on this 
>> erroneous part would be:
>>
>> 1 int main(void)
>> 2 {
>> 3     int arr[]={3,4,-10,22,33,37,11};
>> 4     sort(arr);
>> 5     //int a = arr[3] // Now suppose the programmer missed the
>>       // semicolon here, which generates a compilation error at line 5.
>> 6     // printf("%d\n",a);
>> 7     for(i=0;i<7;i++)
>> 8     {
>> 9         printf("%d\n",arr[i]);
>> 10    }
>> 11 }
>>
>> Now this part of the program (the broken program) is error free, so
>> we can compile it using GCC and get the partial executable.
>>
>> The possible output after compilation with GCC using this plug-in
>> (if the programmer uses it) would be:
>>
>> "You have a syntax error at line 5. To generate the partial
>> executable, lines 5 and 6 have been removed in the temporary file.
>> To run the partial executable, execute p.out."
>>
>> Advantages to the programmer:
>> 1. If the programmer can see the result of the partial executable,
>> he/she can actually quantify his/her progress in the code.
>> 2. Debugging becomes easier, as this plug-in would suggest
>> possible corrections in the code etc.
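
The dependency tracing described above (line 5 is erroneous, line 6 reads `a`) can be sketched roughly as follows. This is only an illustration under strong assumptions: the `struct stmt` summary, the `example` table, and `mark_removed` are hypothetical names, not part of GCC or of any plug-in API, and the model covers straight-line data dependencies only (no control dependencies, aliasing, or redefinitions of a variable by a later, valid statement).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-statement summary: at most one variable defined and
   up to four variables read.  NULL means "none" / end of list. */
struct stmt {
    int line;
    const char *def;
    const char *uses[4];
};

/* Does statement s read variable var? */
static bool reads(const struct stmt *s, const char *var)
{
    for (size_t i = 0; i < 4 && s->uses[i] != NULL; i++)
        if (strcmp(s->uses[i], var) == 0)
            return true;
    return false;
}

/* Mark the erroneous statement (index bad) and every later statement
   that transitively reads a value defined by a removed statement. */
void mark_removed(const struct stmt *prog, size_t n, size_t bad,
                  bool *removed)
{
    assert(bad < n);
    for (size_t i = 0; i < n; i++)
        removed[i] = false;
    removed[bad] = true;

    for (size_t i = bad + 1; i < n; i++)
        for (size_t j = bad; j < i; j++)
            if (removed[j] && prog[j].def != NULL
                && reads(&prog[i], prog[j].def))
                removed[i] = true;
}

/* The example from this thread, summarized by hand. */
const struct stmt example[] = {
    { 3, "arr", { NULL } },        /* int arr[] = {...};        */
    { 4, "arr", { "arr" } },       /* sort(arr);                */
    { 5, "a",   { "arr" } },       /* int a = arr[3]  (broken)  */
    { 6, NULL,  { "a" } },         /* printf("%d\n", a);        */
    { 7, "i",   { "i" } },         /* for (i = 0; i < 7; i++)   */
    { 9, NULL,  { "arr", "i" } },  /* printf("%d\n", arr[i]);   */
};
```

Calling `mark_removed(example, 6, 2, removed)` marks only the entries for lines 5 and 6, matching the hand analysis above. A real implementation would have to work on the compiler's own intermediate representation rather than on a hand-written table like this one.
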
>
> I don't think it will make the actual debugging task easier.  It might 
> make writing code easier (that's what IDEs are doing these days while 
> you're typing code...).  In order to debug a program, the actual bugs 
> need to be _in_ the program, otherwise there is nothing to debug.
> Removing arbitrary parts of the program could potentially introduce 
> new artificial bugs, just because of a missing semicolon.
>
>> * I did not understand the worst case that you mentioned as
>> (un)formatted hard disk. Can you kindly explain it?
>>
>
> Let's say I'm writing a kind of disk utility that reads and writes 
> sectors...
>
> ---------------------
> source1.c:
>
> bool
> copy_sector (void* outbuf, const void* inbuf, int bytecount) {
>   if (bytecount < 4)
>     return false;
>
>   if ((bytecount & 3) != 0)
>     return false;
>
>   int* out_ptr = (int*)outbuf;
>   const int* in_ptr = (const int*)inbuf;
>   int count = bytecount / 4;
>
>   do
>   {
>     int i = *in_ptr++;
>     if (i & 1)
>       i = do_something_special0 (i);
>     else if (i & (1 << 16))
>       i = do_something_special1 (i);
>     *out_ptr++ = i;
>   } while (--count);
>
>   return true;
> }
>
> ---------------------
> source0.c:
>
> int main (void)
> {
>   ...
>   int sector_size = get_sector_size (...);
>   void* sector_read_buf = malloc (sector_size);
>   void* sector_write_buf = malloc (sector_size);
>
>   while (sector_count > 0)
>   {
>     read_next_sector (sector_read_buf);
>     if (copy_sector (sector_write_buf, sector_read_buf, sector_size))
>       write_next_sector (sector_write_buf);
>   }
>   ...
> }
>
>
> Let's assume that in the function copy_sector in source1.c there is a 
> syntax error:
>
>   do
>   {
>     int i = *in_ptr++;
>     if (i & 1)
>       i = do_something_special0 (i);
>     else if (i & (1 << 16))
>       i = do_something_special1 (i);
>
>     *outptr++ = i;  // misspelled 'out_ptr'.
>                     // There is no such variable 'outptr'.
>                     // This line will be left out to make it compile.
>
>   } while (--count);
>
>
> If this broken program is executed it will happily transform data into 
> garbage.
>
>
> Another example could be (copy_sector function again):
>
>   if (bytcount < 4)  // syntax error again, if block is removed
>     return false;    // to enforce compilation.
>
> Now the copy_sector function will happily accept values <= 0 for 
> 'bytecount', which will most likely end up in an integer overflow and 
> a page fault...
>
> Those might be overly extreme and/or silly examples.  What I'm trying 
> to say is that by leaving out program parts, there is a risk of 
> introducing artificial data corruption or artificial infinite loops 
> that accidentally overwrite data.
>
> Cheers,
> Oleg
>
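
Oleg's first scenario can be reproduced in a small, self-contained form. The sketch below is not code from the thread: `copy_sector_broken` is a hypothetical name for his `copy_sector` with the misspelled store removed, and the two `do_something_special` functions are stand-ins invented here because the thread does not define them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; the thread does not say what these do. */
static int do_something_special0(int i) { return i + 1; }
static int do_something_special1(int i) { return i - 1; }

/* copy_sector from the quoted message, after the line
   "*outptr++ = i;" has been dropped to force compilation. */
bool copy_sector_broken(void *outbuf, const void *inbuf, int bytecount)
{
    assert(outbuf != NULL && inbuf != NULL);

    if (bytecount < 4)
        return false;

    if ((bytecount & 3) != 0)
        return false;

    const int *in_ptr = (const int *)inbuf;
    int count = bytecount / 4;

    do {
        int i = *in_ptr++;
        if (i & 1)
            i = do_something_special0(i);
        else if (i & (1 << 16))
            i = do_something_special1(i);
        (void)i;  /* the store to the output buffer is gone */
    } while (--count);

    return true;  /* still reports success to the caller */
}
```

With an output buffer pre-filled with some marker value, the function returns `true` and the buffer still holds the marker afterwards. A caller like the loop in source0.c would then write that stale buffer to disk as if it were the copied sector.
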

I think I have understood your point. It is a nice example. Now, would it be feasible to implement my concept by classifying the bugs that programmers create into categories and applying a bug-specific removal/replacement approach to generate the partial executable? I think it may be possible, and I will have to study a bit to classify the bugs and find an algorithm that removes them to generate the partial executable. Can you suggest an optimal approach for doing this (one that does not run into the halting problem)?


Hi Subrata,
	One problem I find with your idea (this is along the same lines as Oleg's) is that partial compilation will "sweep some errors under the rug." For a large executable with several different control flows (e.g. the gcc compiler itself), partial compilation will not expose the bug. The user will think his/her software works fine, but in reality there is an undiscovered bug. If the user is not careful, (s)he may ignore the warning message, which can be very critical in the future.

	One thing I would suggest to minimize this problem would be to have a set of pragmas or attributes that can indicate the importance of a certain part of the code (it could be a variable, a function, or a code block). For example, a variable or code segment used only for debugging, or for collecting data that you won't be using much, can be marked as low, while the main routines of the program should be marked as high.
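
As a rough illustration of how such importance marks might be consumed, here is a minimal sketch of the decision only. Everything in it is an assumption: GCC has no such pragma or attribute today, the annotation syntax is left open, and `enum importance`, `struct region`, and `partial_build_allowed` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical importance levels a pragma or attribute could attach
   to a variable, function, or code block. */
enum importance { IMPORTANCE_LOW, IMPORTANCE_HIGH };

/* Hypothetical record for one annotated region of the program. */
struct region {
    const char *name;
    enum importance level;
    bool has_error;   /* did this region fail to compile? */
};

/* Allow a partial build only when every erroneous region is marked
   low; an error in a high-importance region stays a hard error. */
bool partial_build_allowed(const struct region *regions, size_t n)
{
    assert(regions != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (regions[i].has_error && regions[i].level == IMPORTANCE_HIGH)
            return false;
    return true;
}
```

Under this policy an error in a debug-only helper could be skipped with a warning, while an error in a main routine such as Oleg's copy_sector would still stop the build.
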

Thanks,

Balaji V. Iyer.



--
Thanking You,

Regards
Subrata Biswas
MTech (pursuing)
Computer Science and Engineering
Indian Institute of Technology, Roorkee
Mob: +91 7417474559

