OpenSees Cloud
OpenSees AMI
How to Profile OpenSees
Original Post - 21 Nov 2022 - Michael H. Scott
Visit Structural Analysis Is Simple on Substack.
Where does the time in a nonlinear finite element analysis go? For large models, solving equations dominates the analysis time, while for small to moderate models, constitutive models (state determination) dominate. It is impossible to know the relative cost of state determination and solving equations prior to an analysis. Profiling tools can tell you after the fact where a program spent its time.
This post shows how to use
gprof
,
a Linux/Unix profiling tool, to find
the distribution of computation time for an OpenSees Tcl analysis.
Profiling a standalone executable is far easier than profiling a shared
object file such as opensees.so
.
Install gprof
The gprof
tool is part of the
binutils
package.
$ sudo apt-get install binutils
Note: I used Ubuntu 20.04 in generating this post.
Re-compile OpenSees in DEBUG Mode
To get a profile out of OpenSees, you need to enable certain compiler
options. So that you don’t have to futz around with command line
arguments, the necessary compiler options are enabled by setting the
DEBUG_MODE
variable in your Makefile.def
configuration file.
DEBUG_MODE = DEBUG
#DEBUG_MODE = NO_DEBUG
Now you need to wipe OpenSees clean using make wipe
in the top level
OpenSees/
directory.
OpenSees/$ make wipe
Then re-build OpenSees. This can take a few minutes, so use multiple processors, e.g., 4, if you got ‘em.
OpenSees/$ make -j 4
Due to sprinkling of profiling bread crumbs throughout the code, the
resulting executable will be larger than normal and analyses will run
slower than usual. So, when you’re done profiling, be sure to change
back to NO_DEBUG
and re-compile after wiping OpenSees clean again. But
don’t do that yet–you still need to profile.
Run an OpenSees Analysis
With OpenSees built in DEBUG
mode, you can run the Tcl script for an
analysis you want to profile.
$ OpenSees model.tcl
Although the analysis will take longer to run compared to NO_DEBUG
,
you’ll see the usual output, etc. However, with debug mode turned on, a
binary file gmon.out
will be created in your working directory. This is
a binary file that gprof can turn into useful information. To this end,
you run gprof with the absolute path to your OpenSees executable (mine
is in ~/bin/
) and re-direct the output to a text file.
$ gprof ~/bin/OpenSees gmon.out > modelProfile.txt
You can redirect to whatever filename you want.
View Profile Results
Now, you can view the text file with profile results to see where the time went. The “Flat profile” at the start of the file gives the most useful information. The flat profile ranks the function calls by % time and also shows cumulative and self time for each function.
For example, in the profile below, 59.18% of the time was spent in
Steel02:setTrialStrain()
, with 15.12% and 11.09% spent in
FiberSection2d::setTrialSectionDeformation()
and
UniaxialMaterial::setTrial()
, respectively.
% cumulative self self total
time seconds seconds calls s/call s/call name
59.18 8.06 8.06 279724800 0.00 0.00 Steel02::setTrialStrain(double, double)
15.12 10.12 2.06 699312 0.00 0.00 FiberSection2d::setTrialSectionDeformation(Vector const&)
11.09 11.63 1.51 626304 0.00 0.00 FiberSection2d::getInitialTangent()
4.48 12.24 0.61 279724800 0.00 0.00 UniaxialMaterial::setTrial(double, double&, double&, double)
More detailed explanations of the flat profile are provided in the text
file generated from gmon.out
. The explanations come after the flat
profile, which can be pretty long. Search the text file for “Copyright”
and you’ll get pretty close to the explanation of terms.
You can find additional information on each function in the subsequent
“Call graph” section. The function with the index number, e.g., [8], has
parent functions listed above and child functions listed below. The self
and children
fields represent the time propagated to and from the
indexed function.
index % time self children called name
2.06 8.78 699312/699312 ForceBeamColumn2d::update() [1]
[8] 79.6 2.06 8.78 699312 FiberSection2d::setTrialSectionDeformation(Vector const&) [8]
0.61 8.17 279724800/279724800 UniaxialMaterial::setTrial(double, double&, double&, double) [9]
0.00 0.00 699312/3471523 Vector::operator=(Vector const&) [37]
0.00 0.00 1398624/1406520 Vector::operator()(int) const [707]
More detailed explanations of the call graph data are provided in the profile text file, after the lengthy call graph list. Search for “Copyright” again and you’ll land near the explanation of call graph terms.
Improve the Code
Now you know where to focus to make your code more efficient. Or you can use the profile information when you post an issue on GitHub.
When you’re done profiling, don’t forget to recompile OpenSees with
NO_DEBUG
so that you get back to a lean, mean executable.
I’m not aware of good tools to profile OpenSeesPy analyses or to profile OpenSees Tcl in a Windows environment. If you know of profiling tools for either of these cases, let everyone know in the Comments section below.