J. Clarke writes:
> On Mon, 4 Jan 2021 23:00:54 +0000, Pancho wrote:
>
>>On 04/01/2021 22:50, Dan Espen wrote:
>>> Pancho writes:
>>>
>>>> On 04/01/2021 17:51, gareth evans wrote:
>>>>> On 04/01/2021 13:08, Pancho wrote:
>>>>>> On 04/01/2021 11:00, gareth evans wrote:
>>>>>>> Thinking back to my first job, nearly 50 years ago now,
>>>>>>> when I had to dis-assemble DEC's paper tape BASIC
>>>>>>> interpreter in order to enhance it, I guess that
>>>>>>> dis-assemblers and decompilers must now be ten-a-penny,
>>>>>>> especially for programs running under Windows where
>>>>>>> the structure of Windows programs is well-known with
>>>>>>> an assumption that C was the source language?
>>>>>>>
>>>>>>> But I wonder if Artificial Intelligence could, after
>>>>>>> being fed with numerous instruction sets, take a
>>>>>>> block of binary, and analyse its source without
>>>>>>> any prior knowledge of the instruction set?
>>>>>>>
>>>>>>> I am particularly interested in the Binary Blob
>>>>>>> provided for Raspberry Pi computers, with a view to
>>>>>>> getting detailed knowledge of the video processors
>>>>>>> employed therein.
>>>>>>>
>>>>>> I think a lot of the problem is defining the question.
>>>>>>
>>>>>> What do you want it to do?
>>>>>>
>>>>> I don't want it to do anything. I want to play at a low level
>>>>> with the thing ... large oaks from little acorns grow.
>>>>>
>>>>
>>>> Play with what thing? What is an instruction set, what is the Binary
>>>> Blob? Why do you need an AI?
>>>>
>>>> Most compilers leave fingerprints on executables; you don't need an AI
>>>> to detect them. I remember decompiling in the early '80s, but it is
>>>> often a challenge to naively reverse engineer a high-level
>>>> understanding of complex modern code, even if you do have the source
>>>> code. Take away sensible variable and function names and you are stuffed.
>>>
>>> I've had more than one experience in putting those meaningful variable
>>> names right back. It's actually pretty easy, a somewhat rote process.
>>> Find the read input instruction. Since you know the layout of the input
>>> record, you now have labels to many of the references to that input
>>> area.
>>>
>>> I think you can work out how to proceed.
>>>
>>>
>>Without the source how do you know any meaningful variable names in the
>>first place?
>
> You start with the inputs and outputs and work into the algorithms and
> eventually maybe you can make sense of it.
Yep.

At one place I worked, they had a program whose source code had been
lost and then reconstructed from object code. They were complaining
that no one could work on it because of the variable and routine names.
It seemed easy enough to me, and I fixed it up in a day or two.
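
For anyone who wants the rote part spelled out, here is a rough Python
sketch of the idea from the quoted text above: once you know where the
input record lands in memory and how it is laid out, references into
that area can be given names mechanically. The record layout, field
names, base address, and the UNK_ prefix are all made up for
illustration; a real job would take them from the file format
documentation and the disassembly listing.

```python
from bisect import bisect_right

# Hypothetical input record layout: (offset, length, field name).
# In practice this comes from the known format of the input file.
RECORD_LAYOUT = [
    (0, 6, "CUST_ID"),
    (6, 30, "CUST_NAME"),
    (36, 8, "ORDER_DATE"),
    (44, 7, "AMOUNT"),
]


def label_for(offset, layout=RECORD_LAYOUT):
    """Return a symbolic name for a byte offset into the input area, or None."""
    starts = [start for start, _, _ in layout]
    i = bisect_right(starts, offset) - 1  # last field starting at or before offset
    if i < 0:
        return None
    start, length, name = layout[i]
    if offset >= start + length:
        return None  # past the end of the last known field
    delta = offset - start
    return name if delta == 0 else f"{name}+{delta}"


def annotate(references, input_base):
    """Map raw operand addresses to labels where they fall inside the input area."""
    labels = {}
    for addr in references:
        label = label_for(addr - input_base)
        # Anything outside the record keeps a placeholder name for now.
        labels[addr] = label if label else f"UNK_{addr:04X}"
    return labels


if __name__ == "__main__":
    # Assumed: the read instruction fills a buffer at 0x1000.
    for addr, name in annotate([0x1000, 0x1006, 0x1024, 0x2000], 0x1000).items():
        print(f"{addr:04X} -> {name}")
```

From there you follow each labelled field into whatever work area it
gets moved to, name that too, and keep going until you reach the
outputs.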
--
Dan Espen