comp.lang.javascript

Motivation behind the web application viewer (WAVI) project

bakunin95

2/22/2015 12:56:00 AM

Web applications pose unique challenges when it comes to understanding and maintaining their heterogeneous structures, which often involve complex interactions between elements from different languages. Accurate and up-to-date documentation is rarely available and this calls for the proposal of reverse engineering approaches for the recovery and representation of such structures. The proposed blog presents our ongoing work on Web Application Viewer (WAVI), a tool able to reverse engineer a web application's structure.

https://blogwavi.word...

Thomas 'PointedEars' Lahn

2/22/2015 11:40:00 AM

bakunin95 wrote:
^^^^^^^^^
Please post to Usenet using your real name (and preferably not using the
borked, spam-injecting, troll-infested Google Groups).

> Web applications pose unique challenges when it comes to understanding and
> maintaining their heterogeneous structures, which often involve complex
> interactions between elements from different languages. Accurate and
> up-to-date documentation is rarely available and this calls for the
> proposal of reverse engineering approaches for the recovery and
> representation of such structures. The proposed blog presents our ongoing
> work on Web Application Viewer (WAVI), a tool able to reverse engineer a
> web application's structure.
>
> https://blogwavi.word...

There are no classes in these languages, and they use dynamic type-checking,
which is *also* why your class diagrams are wrong.

First of all, they have nothing to do with what class diagrams in UML (2)
should look like. For example,

,----------------------------------------------------------.
|                       «JavaScript»                       |
|            react/core/ReactInstanceHandles.js            |
|----------------------------------------------------------|
| string SEPARATOR:                                        |
| number MAX_TREE_DEPTH:100                                |
|----------------------------------------------------------|
| getFirstCommonAncestorID(oneID,twoID)                    |
| getParentID(id)                                          |
| isAncestorIDOf(ancestorID,destinationID)                 |
| getReactRootIDString(index)                              |
| traverseParentPath(start,stop,cb,arg,skipFirst,skipLast) |
| isBoundary(id,index)                                     |
| isValid(id)                                              |
`----------------------------------------------------------'

should be at least

,----------------------------------------------------------------.
|                     ReactInstanceHandles¹                      |
|----------------------------------------------------------------|
| +MAX_TREE_DEPTH : number = 100                                 |
| +SEPARATOR : string                                            |
|----------------------------------------------------------------|
| +getFirstCommonAncestorID(oneID, twoID)                        |
| +getParentID(id)                                               |
| +getReactRootIDString(index)                                   |
| +isAncestorIDOf(ancestorID, destinationID)                     |
| +isBoundary(id,index)                                          |
| +isValid(id)                                                   |
| +traverseParentPath(start, stop, cb, arg, skipFirst, skipLast) |
`----------------------------------------------------------------'

From top to bottom:

AFAIK, the text in guillemets (or double less-than and greater-than signs,
for compatibility) is to indicate either whether the class is an interface,
is abstract, is a non-abstract class that can be instantiated, or which
function (like «Controller») a class has in the application framework. I am
not aware of a notation in which the programming language features in a
class diagram; so if you think it is worthwhile to show that, you should
find another way that does not conflict with UML. (But consider that
“JavaScript” is still imprecise.)

Properties and methods should be sorted by visibility, then by identifier,
each to ease finding them in the diagram; source code order has to be
irrelevant. Therefore, the type, if any, (and initial value, if a
property), *follows* the identifier. Property and method visibility should
be indicated by prefixes to the identifier (here: “+” for public).

Further differentiation whether a property or method is inherited or not (a
distinction that does not exist in class-based inheritance, therefore not a
notation defined in standard UML class diagrams, AFAIK) would be agreeable.

Using space to separate formal parameters in parameter lists increases
readability (in code as well as diagrams). You do not have to specify the
entire parameter list: you can use ellipsis (“…”), or three dots for
compatibility, to show that it is too long for the diagram, therefore
incompletely represented. [That said, functions and methods with more than
a handful of formal parameters each are usually a design mistake; functions
and methods can be passed an argument of an object type whose properties
provide the values instead, so that parameter order is no longer an issue.]
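
A minimal sketch of the options-object idea from the bracketed remark above. The function name `traverseRange` and its property names are hypothetical and only loosely modelled on the long parameter list in the diagram; they are not taken from React.

```javascript
// Hypothetical function: one object argument replaces a long positional
// parameter list. All names here are illustrative assumptions.
function traverseRange(options) {
  var start = options.start;
  var stop = options.stop;
  var skipFirst = Boolean(options.skipFirst); // optional, defaults to false
  var skipLast = Boolean(options.skipLast);   // optional, defaults to false

  var visited = [];
  for (var i = start; i <= stop; i++) {
    if (i === start && skipFirst) continue;
    if (i === stop && skipLast) continue;
    visited.push(i);
  }
  return visited;
}

// The caller names each value, so property order does not matter.
var path = traverseRange({ skipLast: true, stop: 4, start: 1 });
```

In a diagram, such a method needs only one formal parameter, which also keeps the method compartment short.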

Also, it is wrong to count local variables as properties per se:

,------------------------------------.
|            «JavaScript»            |
|     react/utils/PooledClass.js     |
|------------------------------------|
| number DEFAULT_POOL_SIZE: 10       |
| var DEFAULT_POOLER                 |
|------------------------------------|
| fiveArgumentPooler(a1,a2,a3,a4,a5) |
| […]                                |
`------------------------------------'

should be at least:

,-------------------------------------.
|            PooledClass¹             |
|-------------------------------------|
| +DEFAULT_POOL_SIZE : number = 10    |
| -DEFAULT_POOLER : number            |
|-------------------------------------|
| +addPoolingTo(CopyConstructor)      |
| […]                                 |
`-------------------------------------'

First of all, “var” is _not_ a type such as “string”. Variables and
properties in ECMAScript implementations, except that of the Edition 4 draft
if you declare their type, do not have a specific type at compile time
(which is when code analyzers run) except when they are explicitly
initialized with a literal that has no references in it. The languages in
question use dynamic type-checking (sometimes mislabeled “loose typing”),
which means that the type of a variable or property is determined *at
runtime from the current value* (if there never was an explicit assignment,
the variable’s value is “undefined”, the sole value of the Undefined type).
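
A short illustration of that paragraph, using only the standard `typeof` operator; the variable names are arbitrary. The same variable reports a different type after each assignment:

```javascript
var value;                   // declared, never explicitly assigned
var seen = [typeof value];   // the value is undefined at this point

value = "100";               // now holds a string
seen.push(typeof value);

value = 100;                 // now holds a number
seen.push(typeof value);

value = false;               // now holds a boolean
seen.push(typeof value);

// seen records one type name per assignment, in order.
```

A static analyzer that reads only the first assignment would report one of these types and miss the others.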

Now, you *could* assume that the first value explicitly assigned to a
variable or property is its type (and usually that is done by code
analyzers), but it does not mean that it cannot assume a value of a
different type. If necessary, implicit type conversion is performed in
operations, and property setters can modify the type of a value assigned to
a property if the assigned value’s type does not fit the data model.
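
Both effects can be sketched in a few lines. The object `model` and its `duration` property are made up for illustration; the accessor syntax is standard ECMAScript 5.

```javascript
// Hypothetical data model whose setter always stores a string.
var model = {
  _duration: "0",
  get duration() { return this._duration; },
  set duration(v) { this._duration = String(v); }
};

model.duration = 250;                 // a number is assigned ...
var stored = typeof model.duration;   // ... but a string is stored

// Implicit type conversion in operations:
var product = "5" * 2;   // the string is converted to a number
var joined = "5" + 2;    // the number is converted to a string
```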

Second, you *could* consider a local variable a *private* property (hence
the leading “-” here) *iff* there was a publicly accessible method that
accessed it. Such a method would be considered “privileged” because only it
has access to that variable as a closure via the scope chain. On the other
hand, local variables are never visible to the caller; they are an
implementation detail that does not usually feature in class diagrams.
There should be at least an option in the generator that needs to be enabled
explicitly for them to be displayed at all.
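
A minimal sketch of such a privileged method, assuming a hypothetical factory `createPool`; the variable name is borrowed from the diagram above, but the code is not React's.

```javascript
function createPool() {
  var DEFAULT_POOL_SIZE = 10;   // local variable, invisible to callers

  return {
    // Privileged method: it reaches the local variable through the closure.
    getDefaultPoolSize: function () {
      return DEFAULT_POOL_SIZE;
    }
  };
}

var pool = createPool();
var size = pool.getDefaultPoolSize();        // read through the method
var exposed = "DEFAULT_POOL_SIZE" in pool;   // not a property of the object
```

Only in a case like this would it be defensible to draw the variable as a private (“-”) member.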

Third, there are inconsistencies like

| string PREVENT_DURATION:string
| string tapping:false

This should be at least

| +PREVENT_DURATION : string
| +tapping : string = false

but you can see that this is still wrong because the type specification and
the assigned value do not match (the type of “false” is boolean, not
string), which is because of dynamic type-checking. Either you need to
decide which value is the one that determines the type of the property (I
suggest the first one, but that is _not_ necessarily in source code order),
or if there are multiple possibilities you can list each detected type, or
you need to omit the type specification.
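
The second and third options could look roughly like this. Both helper names are hypothetical, and the input is assumed to be the list of values a generator has already found assigned to one property.

```javascript
// Hypothetical helper: distinct type names in first-seen order.
function detectedTypes(assignedValues) {
  var types = [];
  assignedValues.forEach(function (value) {
    var type = value === null ? "null" : typeof value;
    if (types.indexOf(type) === -1) {
      types.push(type);
    }
  });
  return types;
}

// Hypothetical helper: build a UML-like attribute line; the type
// specification is omitted when no assignment was found.
function attributeLine(name, assignedValues) {
  var types = detectedTypes(assignedValues);
  return "+" + name + (types.length ? " : " + types.join("|") : "");
}

var line = attributeLine("tapping", [false, "yes"]);
```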

This description of mistakes is not meant to be exhaustive. While I
understand your motivation and commend your efforts, I must strongly
recommend that you read the FAQ and the references therein, and this
newsgroup, and learn the language*s* from the ground up; then carefully read
<https://en.wikipedia.org/wiki/Class_d... and the references therein,
and start over. I find it highly unlikely that the diagrams as they are
generated now could be useful to anyone.

_______
¹ These are educated guesses; it is not obvious whether the filename has
anything to do with a custom object type defined in the file; it could
(although usually should not) just be a collection of global variables
with global functions. There is no built-in concept of namespaces in
these languages.
--
PointedEars
FAQ: <http://PointedEars.... | SVN: <http://PointedEars.de...
Twitter: @PointedEars2 | ES Matrix: <http://PointedEars.de/es-...
Please do not cc me. / Bitte keine Kopien per E-Mail

bakunin95

2/22/2015 2:37:00 PM

Hi Thomas, thank you for your post; I appreciate you taking the time to comment.

First of all, I agree: I need to respect the UML standard more, for example in where I put the type, the visibility, and the ordering. I admit I did not put much effort into that, and it shows. The good news is that the hard work is extracting all that information from the source code, not displaying it, so most of it should be fixed by the end of the week.

I intend to generate a diagram of a web application and all its components. I want to support javascript/nodejs, css, html, java. The problem I have is that UML is not meant to show web applications and the components of all these languages, except for Java, so I use «stereotypes» and color to extend UML and to show the type of the file. In this React framework example, you only see JavaScript files.

JavaScript is definitely the most difficult one to show in UML; I know I will never have a perfect representation, but at least I will try.
The fact that there are no classes yet and that so many people use this language differently poses a problem. For example, I cannot tell for sure whether ReactInstanceHandles is meant to be a class or whether it is simply a standalone script. I cannot tell in what environment the files will be run: is it a Node.js module for the server, or is it a JavaScript file for the client?
What I found out with my project is that what would be a small class in Java ends up being an Object inside a JavaScript file. This is why I extract that information. As for Array, I'm still unsure whether they are interesting components to extract or not.

The visibility is also quite tricky: it is not explicitly shown in the Abstract Syntax Tree, so I will have to work it out manually if that is not too complicated; that is why it is not already done. However, it is very high on my priority list, so I thank you for your comment about that.

Again, thank you. If there is anything else I can do to make my project better, feel free to share.

Thomas 'PointedEars' Lahn

2/22/2015 6:30:00 PM

bakunin95 wrote:

> thank you for your post; I appreciate you taking the time to comment.

You are welcome.

> I intend to generate a diagram of a web application and all its
> components.

The least you should do then is to allow the user to limit the level of
inspection. A Web developer (like me) should know the libraries and
frameworks they are using rather well, and there is API documentation for
those to find out; in a class diagram or component diagram I suppose they,
too, are more interested in the structure of *their own* code (that uses
libraries and frameworks, if necessary).

> I want to support javascript/nodejs,css,html,java.

There is no “javascript”. This is the first thing that you need to accept
and understand if you want to be successful there. You are dealing with
implementations of ECMAScript: a *family* of programming languages, with a
similar but different syntax and feature set each; at first because the
standard attempted to reconcile differences between existing implementations
(Netscape JavaScript and Microsoft JScript), now because the standard allows
extensions for conforming implementations.

Node.js is not a programming language, it is an application framework that
uses Google V8 JavaScript, a particular implementation of ECMAScript Edition
5+.

CSS has classes, but only as formatting hooks: it is a stylesheet language,
not a programming language. I cannot imagine a class diagram of a
stylesheet.

HTML has a “class” attribute but it is a markup language, not a programming
language. The only way in which I can see the inspection of HTML as useful
there is to determine client-side scripts, whether inline or referenced (see
below).

_Java_ *is* a programming language, and it uses class-based inheritance, so
it is a suitable target. However, there are UML tools for it already; so
much so that you can take Java code, generate a UML class diagram from it,
and edit the class diagram where those changes are reflected in the Java
code. So you should ask yourself whether you should invest the time to
reinvent the wheel.

ISTM that Java as a programming language for client-side *Web* applications
is on the way out as the Netscape Plugin API (NPAPI) required for applets is
not (well-)supported by modern browsers anymore; but it is used for
standalone client applications and server-side code of Web applications.

> The problem I have is that UML is not meant to show web applications and
> the components of all these languages, except for Java, so I use
> «stereotypes» and color to extend UML and to show the type of the file.
> In this React framework example, you only see JavaScript files.

Rather, UML *class* diagrams are not meant to represent the structure of
code written in a *prototype*-based programming language. However, it is
possible to model a prototype object as a class *for the purpose of showing
inheritance*. Consider, for example (based on ECMAScript, 5.1 Edition):

,-------------------------------------------------------------.
|                      Object.prototype                       |
|-------------------------------------------------------------|
| +constructor : Function = Object                            |
|-------------------------------------------------------------|
| +hasOwnProperty(v: String¹) : boolean                       |
| +isPrototypeOf(v: Object) : boolean                         |
| +propertyIsEnumerable(v: String) : boolean                  |
| +toLocaleString() : string¹                                 |
| +toString() : string                                        |
| +valueOf() : any²                                           |
`-------------------------------------------------------------'
                               ^
                               :
                               :
,-------------------------------------------------------------.
|                       Array.prototype                       |
|-------------------------------------------------------------|
| +constructor : Function = Array                             |
|-------------------------------------------------------------|
| +concat(…) : Array                                          |
| +every(callbackfn: Function, thisArg: Object) : boolean     |
| +filter(callbackfn: Function, thisArg: Object) : Array      |
| +forEach(callbackfn: Function, thisArg: Object) : undefined |
| +indexOf(searchElement, fromIndex: int³) : int              |
| +join(separator) : string                                   |
| +lastIndexOf(searchElement, fromIndex: int) : int           |
| +map(callbackfn: Function, thisArg: Object) : Array         |
| +pop() : any                                                |
| +push(…) : int                                              |
| +reduce(callbackfn: Function, initialValue) : any           |
| +reduceRight(callbackfn: Function, initialValue) : any      |
| +reverse() : Array                                          |
| +shift() : any                                              |
| +slice(start: int, end: int) : Array                        |
| +some(callbackfn: Function, thisArg: Object) : boolean      |
| +sort(comparefn: Function) : Array                          |
| +splice(start: int, deleteCount: int, item1, …) : Array     |
| +toLocaleString() : string                                  |
| +toString() : string                                        |
| +unshift(item1, …) : int                                    |
`-------------------------------------------------------------'

_________
I am using several conventions for argument and return types here that
I have found useful (and that are used in the ECMAScript Support Matrix
[“ES Matrix”; see sig], with a more detailed explanation to come about
the [rather complex] type system of ECMAScript implementations):

¹ “String” means that you can pass both a primitive string value and a
reference to an object that inherits from String.prototype, or that the
passed value is converted to that; “string” means that the value has to be
a primitive string value. Likewise for other types that allow for
primitive values.
² There is no “any” type; “any” indicates that a value of any type may
be passed or returned.
³ “int” means that a value of type “int” is expected if such a type is
supported, and a value of type Number with decimal fractional part 0
otherwise.
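
The inheritance relationship drawn in the diagram can be checked at runtime with standard ECMAScript 5.1 methods. This is only a sketch; the variable names are arbitrary. The same calls also tell an own property from an inherited one, which is the distinction a generator would need in order to mark inherited members.

```javascript
var list = [1, 2, 3];

// The arrow in the diagram: Array.prototype inherits from Object.prototype.
var arrayProtoParent =
  Object.getPrototypeOf(Array.prototype) === Object.prototype;

// "push" is inherited by the array instance and owned by Array.prototype.
var ownPush = list.hasOwnProperty("push");
var protoPush = Array.prototype.hasOwnProperty("push");

// "hasOwnProperty" itself lives one level further up the chain.
var inherited = !Array.prototype.hasOwnProperty("hasOwnProperty") &&
  Object.prototype.hasOwnProperty("hasOwnProperty");
```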

> JavaScript is definitely the most difficult one to show in UML, […]. The
> fact that there are no classes yet and that so many people use this
> language differently poses a problem.

The main problem is that so many people think it is only one language, the
same everywhere.

There are classes already in implementations of the abandoned ECMAScript
Edition 4 Working Draft, for example in Microsoft JScript .NET and
Macromedia/Adobe ActionScript 2.0+.

There are going to be classes, in a fashion different from that, in
implementations of the upcoming ECMAScript Edition 6. Probably Mozilla
JavaScript and Google V8 JavaScript will be the first to support that.

People use these language*s* so differently because they are so flexible;
that is a, if not *the*, strength of them.

> For example, I cannot tell for sure whether ReactInstanceHandles is meant
> to be a class or whether it is simply a standalone script.

That depends on how people think of a “class” in a language that has no
classes. If they think about it in terms of something that can be
instantiated using the “new” keyword, then you can look for that: an
identifier or property accessor that is used with “new” is meant to refer to
a constructor. One can consider a constructor a required part for a
user-defined object type (that which most closely resembles what is a class
with class-based inheritance).
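
A rough sketch of that lookup, under a strong simplifying assumption: it scans raw source text with a regular expression, so it also matches inside strings and comments. An analyzer that already has a syntax tree would inspect its “new” expressions instead. The function name is hypothetical.

```javascript
// Hypothetical heuristic: collect identifiers or dotted property accessors
// that follow the "new" keyword, without duplicates.
function findConstructorNames(source) {
  var pattern = /\bnew\s+([A-Za-z_$][\w$]*(?:\.[A-Za-z_$][\w$]*)*)/g;
  var names = [];
  var match;
  while ((match = pattern.exec(source)) !== null) {
    if (names.indexOf(match[1]) === -1) {
      names.push(match[1]);
    }
  }
  return names;
}

var found = findConstructorNames(
  "var a = new Foo(); var b = new app.Bar(1); var c = new Foo();"
);
```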

> I cannot tell in what environment the files will be run: is it a Node.js
> module for the server,

ISTM that a telltale sign of a Node.js module is the occurrence of a
standalone “module.exports” in a source file.
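
As a sketch only, that check could be a text search. The function name is hypothetical, and the pattern is an assumption about what “standalone” should mean here (not preceded by a dot or an identifier character). It cannot tell code from comments or string literals.

```javascript
// Hypothetical heuristic: does the source mention a standalone
// "module.exports", i.e. one that is not a property of another object?
function looksLikeNodeModule(source) {
  return /(^|[^\w$.])module\.exports\b/.test(source);
}

var serverSide = looksLikeNodeModule("module.exports = { run: run };");
var clientSide = looksLikeNodeModule("window.onload = init;");
var nested = looksLikeNodeModule("foo.module.exports = 1;");
```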

BTW, Node.js is not tied to server-side applications. (Where I work we are
using it exclusively "client-side", in build scripts.)

> or is it a JavaScript file for the client?

A possibility is to look for “script” elements referring to the file as a
resource in (X)HTML markup. But scripts can be combined, there is content
negotiation, and they can be loaded dynamically as well.

> What I found out with my project is that what would be a small
> class in Java ends up being an Object inside a JavaScript file.

I do not understand this statement. Object instances exist in memory, not
in source files.

> This is why I extract that information. As for Array, I'm still unsure
> whether they are interesting components to extract or not.

Native object types should only be inspected per user preference.

> The visibility is also quite tricky: it is not explicitly shown in the
> Abstract Syntax Tree,

It is if you make function declarations and expressions, thereby execution
contexts, nodes in the tree.

> […] If there is anything else I can do to make my project better, feel
> free to share.

[x] done

John Harris

2/23/2015 10:20:00 AM

On Sun, 22 Feb 2015 19:29:31 +0100, Thomas 'PointedEars' Lahn
<PointedEars@web.de> wrote:

<snip>
>There is no “javascript”.
<snip>

Concerning 'car' :
Some small passenger vehicles have 4 road wheels. Some have 3 road
wheels. One even had 2 road wheels; it used a dirty great gyro to
keep it upright. There are significant differences between them.
Therefore, when someone tells you they came to work by car today you
must shout
THERE IS NO CAR!

What goes for 'car' goes for 'javascript'.

John

Thomas 'PointedEars' Lahn

2/23/2015 2:17:00 PM

John Harris wrote:

> Thomas 'PointedEars' Lahn wrote:
> <snip>
>>There is no “javascript”.
> <snip>
>
> Concerning 'car' :
> Some small passenger vehicles have 4 road wheels. Some have 3 road
> wheels. One even had 2 road wheels; it used a dirty great gyro to
> keep it upright. There are significant differences between them.
> Therefore, when someone tells you they came to work by car today you
> must shout
> THERE IS NO CAR!
>
> What goes for 'car' goes for 'javascript'.

If you were not that … unobservant, you would have realized that this
sentence had a context, and that it is important for writing a code analyzer
that the syntax of different implementations of ECMAScript, including those
of different *Editions* of ECMAScript, is indeed *different*. (And to
understand *why* that is so.)

John Harris

2/24/2015 9:58:00 AM

On Mon, 23 Feb 2015 15:17:12 +0100, Thomas 'PointedEars' Lahn
<PointedEars@web.de> wrote:

>John Harris wrote:
>
>> Thomas 'PointedEars' Lahn wrote:
>> <snip>
>>>There is no “javascript”.
>> <snip>
>>
>> Concerning 'car' :
>> Some small passenger vehicles have 4 road wheels. Some have 3 road
>> wheels. One even had 2 road wheels; it used a dirty great gyro to
>> keep it upright. There are significant differences between them.
>> Therefore, when someone tells you they came to work by car today you
>> must shout
>> THERE IS NO CAR!
>>
>> What goes for 'car' goes for 'javascript'.
>
>If you were not that … unobservant, you would have realized that this
>sentence had a context, and that it is important for writing a code analyzer
>that the syntax of different implementations of ECMAScript, including those
>of different *Editions* of ECMAScript, is indeed *different*. (And to
>understand *why* that is so.)

In that case why write "There is no javascript"? Why not be more
informative and helpful by writing
"Remember that 'javascript' names a family of languages " ...

John

Thomas 'PointedEars' Lahn

2/26/2015 10:30:00 AM

John Harris wrote:

> […] Thomas 'PointedEars' Lahn […] wrote:
>> John Harris wrote:
>>> Thomas 'PointedEars' Lahn wrote:
>>> <snip>
>>>>There is no “javascript”.
>>> <snip>
>>>
>>> […]
>>> Therefore, when someone tells you they came to work by car today you
>>> must shout
>>> THERE IS NO CAR!
>>>
>>> What goes for 'car' goes for 'javascript'.
>>
>>If you were not that … unobservant, you would have realized that this
                         ^^^^^^^^^^^
>>sentence had a context, and that it is important for writing a code
>>analyzer that the syntax of different implementations of ECMAScript,
>>including those of different *Editions* of ECMAScript, is indeed
>>*different*. (And to understand *why* that is so.)
>
> In that case why write "There is no javascript"? Why not be more
> informative and helpful by writing
> "Remember that 'javascript' names a family of languages " ...

Because that would be confirming a misconception.

As regards “family of languages”: QED.

John Harris

2/26/2015 7:27:00 PM

On Thu, 26 Feb 2015 11:29:47 +0100, Thomas 'PointedEars' Lahn
<PointedEars@web.de> wrote:

>John Harris wrote:
>

<snip>
>> In that case why write "There is no javascript"? Why not be more
>> informative and helpful by writing
>> "Remember that 'javascript' names a family of languages " ...
>
>Because that would be confirming a misconception.
>
>As regards “family of languages”: QED.

But you, Thomas, are the only person in the world who thinks it's a
misconception. Its use as a family name has been displayed in the FAQ
for 15 years or so. It's time you thought about it sensibly instead of
thinking it's a single language name.

John

Thomas 'PointedEars' Lahn

2/27/2015 4:48:00 AM

John Harris wrote:

> On Thu, 26 Feb 2015 11:29:47 +0100, Thomas 'PointedEars' Lahn
> <PointedEars@web.de> wrote:
>>John Harris wrote:
> <snip>
>>> In that case why write "There is no javascript"? Why not be more
>>> informative and helpful by writing
>>> "Remember that 'javascript' names a family of languages " ...
>>
>>Because that would be confirming a misconception.
>>
>>As regards “family of languages”: QED.
>
> But you, Thomas, are the only person in the world who thinks it's a
> misconception.

Claiming something does not make it so. Neither does artistic snipping.

> Its use as a family name has been displayed in the FAQ
> for 15 years or so.

A misconception by few underinformed people, to be corrected.

> It's time you thought about it sensibly instead of
> thinking it's a single language name.

It’s *past* time that you posted something different from this constant
bickering, something constructive, to this newsgroup, like source code,
or finally FOAD'd.

John Harris

2/27/2015 10:26:00 AM

On Fri, 27 Feb 2015 05:48:04 +0100, Thomas 'PointedEars' Lahn
<PointedEars@web.de> wrote:

>John Harris wrote:
>
>> On Thu, 26 Feb 2015 11:29:47 +0100, Thomas 'PointedEars' Lahn
>> <PointedEars@web.de> wrote:
>>>John Harris wrote:
>> <snip>
>>>> In that case why write "There is no javascript"? Why not be more
>>>> informative and helpful by writing
>>>> "Remember that 'javascript' names a family of languages " ...
>>>
>>>Because that would be confirming a misconception.
>>>
>>>As regards “family of languages”: QED.
>>
>> But you, Thomas, are the only person in the world who thinks it's a
>> misconception.
>
>Claiming something does not make it so.

But there's evidence : a lack of other people claiming it's a
misconception.


>Neither does artistic snipping.

The FAQ says :
Quote only relevant parts of earlier messages, and add your comments
below each quoted section (FYI28/RFC1855).



>> Its use as a family name has been displayed in the FAQ
>> for 15 years or so.
>
>A misconception by few underinformed people, to be corrected.

But what is the 'misconception'? Is it the particular name? Or the
idea of a family of languages having a name?


>> It's time you thought about it sensibly instead of
>> thinking it's a single language name.
>
>It’s *past* time that you posted something different from this constant
>bickering, something constructive, to this newsgroup, like source code,

Every time you criticise someone for something that's merely a matter
of your opinion I will do my best to ensure that no-one is misled by
your language.


>or finally FOAD'd.

Vulgar abuse is hardly the most convincing kind of logical argument.


John