The Rigi C++ Parser
This document explains the usage of cs2rsf, the Rigi C++
parser. The Rigi C++ parser is designed to extract entities and
the relationships between entities from files of C++ source code.
Currently the entities that the parser extracts are namespaces,
classes and their members, functions, datatypes, and variables.
The relationships that the parser
extracts are containment relationships for namespaces, inheritance
for classes, function calls between function entities, data
accesses between functions and data structure entities, and variable
references between functions and variables. The
parser also extracts other attributes of the entities, such as
their type (either Function or Data) and the line number in the
source code file that they were extracted from.
cs2rsf reads VisualAge for C++ 4.0 project files and writes the
entities, relationships, and attributes it extracts as RSF
(Rigi Standard Format) 4-tuples to standard output. As a side effect,
the project will be compiled and linked.
- The C++ Domain
- cs2rsf Command Line Parameters
- Usage Examples
The C++ Domain
Node Types
Besides the standard Rigi node types (Unknown and Collapse),
the C++ domain contains the following node types:
- A Function node is emitted for every function in the C++ source
code.
- A Prototype node is emitted for every prototype in the C++ source
code.
- A Variable node is emitted for every non-constant variable
(local, global, or formal parameter) in the C++ source code.
- A Constant node is emitted for every constant variable
(local, global, or formal parameter) in the C++ source code.
- A Datatype node is emitted for every datatype defined in the
C++ source code. This includes predefined data types, enumerations,
and typedefs.
- A Class node is emitted for every class defined in the
C++ source code. This includes classes, structures, and unions.
- A Namespace node is emitted for every namespace defined
in the C++ source code.
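As a hedged illustration of the list above, the following hypothetical
source file contains one construct for each node type. The file name, the
identifiers, and the comments naming the expected node are assumptions
derived from the descriptions above; they are not captured cs2rsf output.

```cpp
// sample.cc: hypothetical input file. The comments restate the node-type
// list above; they are not output recorded from cs2rsf.
#include <cassert>

namespace geometry {                      // Namespace node: geometry

typedef double Length;                    // Datatype node: Length (typedef)

enum Unit { Metre, Foot };                // Datatype node: Unit (enumeration)

struct Point {                            // Class node: Point (structures count as classes)
    Length x;
    Length y;
};

const Length kFeetPerMetre = 3.28084;     // Constant node: kFeetPerMetre (constant global)

Length scaleFactor = 1.0;                 // Variable node: scaleFactor (non-constant global)

Length convert(Length value, Unit to);    // Prototype node: convert

// Function node: convert. The formal parameters value and to are
// non-constant, so each would be a Variable node.
Length convert(Length value, Unit to) {
    assert(value >= 0.0);
    const Length scaled = value * scaleFactor;   // Constant node: scaled (constant local)
    return to == Foot ? scaled * kFeetPerMetre : scaled;
}

}  // namespace geometry
```

The list above does not say which node type class members such as
Point::x receive, so the sketch leaves them unannotated.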
Node Names
Node names will be built from the name of an artifact (variable, ...),
the location where it is defined, and a scope number, if the artifact is
defined within a namespace, class, or function:
- A global variable number defined on line 1 of file
myfile.cc will be called number^myfile.cc^1.
- A parameter argc of function main which is defined
on line 2 of file myfile.cc will be called
argc^main^myfile.cc^2^0.
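The naming scheme can be sketched as a pair of small helpers. This is a
minimal sketch that only reproduces the two examples above: the function
names are hypothetical, and the position of the enclosing scope and of the
scope number within the name is inferred from those examples rather than
taken from the cs2rsf sources.

```cpp
#include <cassert>
#include <string>

// Hypothetical helpers, not part of cs2rsf. They join the parts of a node
// name with '^' in the order seen in the two examples above.

// Name of an artifact defined at file scope: name^file^line.
std::string globalNodeName(const std::string& name,
                           const std::string& file, int line) {
    assert(!name.empty());
    return name + "^" + file + "^" + std::to_string(line);
}

// Name of an artifact defined inside a namespace, class, or function:
// name^scope^file^line^scopeNumber (layout assumed from the argc example).
std::string scopedNodeName(const std::string& name, const std::string& scope,
                           const std::string& file, int line,
                           int scopeNumber) {
    assert(!name.empty() && !scope.empty());
    return name + "^" + scope + "^" + file + "^" + std::to_string(line) +
           "^" + std::to_string(scopeNumber);
}
```

Under these assumptions, globalNodeName("number", "myfile.cc", 1) yields
number^myfile.cc^1 and scopedNodeName("argc", "main", "myfile.cc", 2, 0)
yields argc^main^myfile.cc^2^0.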
Arc Types
Besides the standard Rigi arc types (level, composite,
and multiarc), the C++ domain contains the following arc
types. If you read the elements of an arc in the order element 2,
element 1, element 3, then you will have an English sentence that explains
the relation:
- An extends arc is generated for every class that inherits from
other classes. For every superclass of that class, an arc is
created to connect the subclass to the superclass.
- An isInNamespace arc is generated for every artifact that is
contained in a namespace. It connects the artifact to the namespace.
- An isMemberOf arc is generated for every member of a class.
It connects the member to the containing class.
- An isTheSameAs arc is generated for every typedef. It connects
the type that is defined to the original data type.
- An accesses arc is generated if the code of a function
accesses elements within a structure or union (rather than just the
structure or union as a whole). This helps find out which functions
depend on the internal representation of a data type. The arc goes
from the function to the data type accessed.
- A calls arc is generated for every function call. It originates
from a function and points to either a function or a prototype node.
- A returns arc is generated for every prototype and function.
It connects the prototype or function to the data type of its return value.
- A references arc is generated whenever a function references a
variable. It connects the function to the variable.
- A hasType arc is generated for every variable and constant node.
It connects the variable or constant to its data type.
- An isDefinedIn arc is emitted for variable, constant, datatype,
and prototype nodes. It connects them to the function they are
defined in.
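The reading order described at the start of this section (element 2,
element 1, element 3) can be sketched as follows. The Arc structure, its
field names, and the sample values are hypothetical. The sketch assumes
that element 1 is the arc type, element 2 the source, and element 3 the
destination, and it ignores any further tuple element.

```cpp
#include <cassert>
#include <string>

// Hypothetical representation of one arc; cs2rsf does not define this type.
struct Arc {
    std::string type;         // element 1, for example "extends"
    std::string source;       // element 2, where the arc originates
    std::string destination;  // element 3, where the arc points
};

// Reads the elements in the order 2, 1, 3 to form an English sentence.
std::string toSentence(const Arc& arc) {
    assert(!arc.type.empty());
    return arc.source + " " + arc.type + " " + arc.destination;
}
```

With the made-up arc {"extends", "Derived", "Base"}, toSentence returns
"Derived extends Base", which matches the direction given for extends arcs:
from the subclass to the superclass.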
Attributes
cs2rsf emits the following attributes for nodes:
- tagged has no meaning for Rigi views and should be ignored.
htmlrsf uses it to insert HTML links and tags into C++ source
files and emits nodeurl attributes instead.
cs2rsf Command Line Parameters
cs2rsf expects the name of the project file for the project to parse
as its only command line parameter. A short usage summary is displayed
if cs2rsf is invoked with too few or too many parameters.
Usage Examples
Using cs2rsf to generate Rigiedit views
To generate a Rigi view for a project, first invoke cs2rsf, then run
sortrsf to remove duplicate tuples, combine multiple arcs with the same
source and destination into arcs of type multiarc, and sort the
RSF file for faster processing:
cs2rsf myproject.icc | sortrsf -m > myproject.rsf
Now you can load myproject.rsf into rigiedit.
Using cs2rsf to htmlize source and generate Rigiedit views
Use cs2rsf to parse your C++ program and sortrsf with the -4
option to remove duplicate tuples from the RSF file:
cs2rsf myproject.icc | sortrsf -4 > temp.rsf
Use htmlrsf to htmlize the source and create nodeurl
attributes:
htmlrsf -pxa extends,calls,references,tagged -b type < temp.rsf > myproject.rsf
Now you can load myproject.rsf into rigiedit. If you
double-click on a node, Netscape is started to display the source code
for the node.
This manual page is maintained by
Johannes Martin