CS520
Fall 2013
Program 3
Due Sunday October 6
Write a C program, called dis_class, that takes a single command line
parameter, which is the name of a java class file. The program should
then parse the class file, and disassemble all methods found in the
classfile.
Command Line Arguments
The program will take one argument, the name of the class file to
disassemble.
- Name of the java class file to disassemble
For example:
./dis_class Test.class
denotes reading the content of Test.class and disassembling the
methods found.
Functionality
Each method that is disassembled should be printed out in the
following format:
cmo66@zelinka:~/cs520/p3/classfile$ ./dis_class tests/Pathetic.class
Method Name: <init>
0 aload_0
1 invokespecial (1)
4 return
Method Name: main
0 return
Each method should have "Method Name: " followed by the method's name
as looked up in the constant table.
Each instruction should appear on its own line, beginning with a tab,
followed by the byte offset, followed by a space, followed by the
opcode name. Opcodes that take a single parameter should have the
parameter following in parentheses, like invokespecial as seen above.
Opcodes that take more than one parameter should have the parameters
listed in parentheses, separated by a comma and a space, like iinc as
follows.
cmo66@zelinka:~/cs520/p3/classfile$ ./dis_class tests/Iinc.class
Method Name: <init>
0 aload_0
1 invokespecial (1)
4 return
Method Name: main
0 iconst_0
1 istore_1
2 iinc (132, 1)
5 getstatic
6 nop
7 iconst_m1
8 iload_1
9 invokevirtual (3)
12 return
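The line layout described above can be sketched as a small formatting helper. This is only an illustration: format_insn is a hypothetical name, not part of the starter code, and the layout (tab, offset, space, name, operands in parentheses separated by a comma and a space) is inferred from the sample output. The reference solution remains the authority on exact output.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not in the starter code): format one instruction
 * line into buf. Assumed layout: tab, byte offset, space, opcode name,
 * then any operands in parentheses separated by ", ".
 * size must be greater than 0; output is truncated if buf is too small. */
static void format_insn(char *buf, size_t size, int offset,
                        const char *name, const int *args, int nargs)
{
    snprintf(buf, size, "\t%d %s", offset, name);
    for (int i = 0; i < nargs; i++) {
        /* Append after whatever has been written so far. */
        size_t used = strlen(buf);
        snprintf(buf + used, size - used, "%s%d%s",
                 i == 0 ? " (" : ", ",
                 args[i],
                 i == nargs - 1 ? ")" : "");
    }
}
```

Building the line in a buffer first, rather than printing piece by piece, makes it easy to compare against an expected string in a test.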
Note that the various types of jumps also require special care. When
the jumps appear in the file, they appear as an offset from the
current location, but when they get printed, they should be printed as
a final destination, not a delta, so you will need to note what the
current byte was for the instruction, and then add the offset to get
the final destination byte.
-bash-4.2$ ./dis_class tests/Iftest.class
Method Name: <init>
0 aload_0
1 invokespecial (1)
4 return
Method Name: itm
0 iload_1
1 bipush (100)
3 if_icmpge (17)
6 getstatic (2)
9 ldc (3)
11 invokevirtual (4)
14 goto (25)
17 getstatic (2)
20 ldc (5)
22 invokevirtual (4)
25 return
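As a rough sketch of the jump arithmetic described above: the two bytes after a branch opcode form a signed, big-endian 16-bit delta, and the printed value is the offset of the branch instruction plus that delta. The names read_s2 and branch_target are made up for this example and are not part of the starter code.

```c
#include <stdint.h>

/* Read a signed 16-bit big-endian value starting at code[pos]. */
static int read_s2(const uint8_t *code, int pos)
{
    int v = (code[pos] << 8) | code[pos + 1];
    if (v >= 0x8000)
        v -= 0x10000;   /* sign-extend: deltas may be negative (loops) */
    return v;
}

/* Absolute destination of a 2-byte branch whose opcode is at insn_offset.
 * The delta is relative to the opcode byte itself, not to the operand. */
static int branch_target(const uint8_t *code, int insn_offset)
{
    return insn_offset + read_s2(code, insn_offset + 1);
}
```

In the Iftest output above, if_icmpge sits at byte 3 and prints 17, so the stored delta is 14. Backward jumps store a negative delta, which is why the sign extension matters.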
The two other nontrivial opcodes are tableswitch and lookupswitch.
See below for the formatting of those two opcodes. The general format
is that the description of the entire table (default, high, low,
npairs, whatever is present for the switch) is to be written 1 element
per line starting with 2 tabs, and the different choices should be
written one per line with 3 tabs. Note that the targets in tables are
also offset byte amounts, and will need to be handled as such.
-bash-4.2$ ./dis_class tests/Tableswitch.class
Method Name: <init>
0 aload_0
1 invokespecial (1)
4 return
Method Name: main
0 bipush (10)
2 istore_1
3 iload_1
4 tableswitch
default: 32
low: 0
high: 2
0: 32
1: 32
2: 32
32 return
-bash-4.2$ ./dis_class tests/Lookupswitch.class
Method Name: <init>
0 aload_0
1 invokespecial (1)
4 return
Method Name: main
0 bipush (10)
2 istore_1
3 iload_1
4 lookupswitch
default: 40
nPairs: 3
0: 40
10: 40
100: 40
40 return
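A minimal sketch of decoding tableswitch follows, assuming the layout from the JVM specification: after the opcode come 0 to 3 padding bytes so that the next field starts at a multiple of 4 from the beginning of the method's code, then default, low, and high as signed 32-bit big-endian values, then high - low + 1 jump offsets. All offsets are relative to the tableswitch opcode. The function names are hypothetical, the tab layout is my reading of the description above, and no validation of malformed input is attempted.

```c
#include <stdint.h>
#include <stdio.h>

/* Read a signed 32-bit big-endian value starting at code[pos]. */
static int32_t read_s4(const uint8_t *code, int pos)
{
    uint32_t v = ((uint32_t)code[pos] << 24) |
                 ((uint32_t)code[pos + 1] << 16) |
                 ((uint32_t)code[pos + 2] << 8) |
                 (uint32_t)code[pos + 3];
    return (int32_t)v;
}

/* Hypothetical helper: print the tableswitch at code[pc] and return the
 * byte offset of the instruction that follows it. */
static int print_tableswitch(FILE *out, const uint8_t *code, int pc)
{
    int pos = (pc + 4) & ~3;    /* skip opcode, round up to multiple of 4 */
    int def = read_s4(code, pos);
    int low = read_s4(code, pos + 4);
    int high = read_s4(code, pos + 8);
    pos += 12;

    fprintf(out, "\t%d tableswitch\n", pc);
    fprintf(out, "\t\tdefault: %d\n", pc + def);
    fprintf(out, "\t\tlow: %d\n", low);
    fprintf(out, "\t\thigh: %d\n", high);

    int count = high - low + 1;
    for (int i = 0; i < count; i++, pos += 4)
        fprintf(out, "\t\t\t%d: %d\n", low + i, pc + read_s4(code, pos));
    return pos;
}
```

This matches the sample above: the opcode is at byte 4, the fields start at byte 8, and three 4-byte entries end at byte 32. lookupswitch uses the same padding rule, followed by default, npairs, and then npairs pairs of (match, offset).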
Ignore These Opcodes (and extra credit)
I have been having difficulty generating test data for the opcodes
0xc4 (wide) and 0xba (invokedynamic).
For this reason, these opcodes will not be part of this assignment,
and you can assume that your program is not required to handle them.
Any student who wishes to receive additional points on the program
must do the following:
- Create a java program that produces the relevant opcode (0xc4 or
0xba) when compiled on agate
- Make your program handle the selected opcode correctly, as
specified above.
- Send me an email with the java file prior to the due date
Doing this for either opcode will earn 5 extra points, for a maximum
of 10 possible points. Note that this may not be trivial, and may
actually be impossible; I couldn't figure out a good way to make
either opcode show up.
Reference Solution
I will be providing a binary that you can use to compare your results
against. I will be comparing the outputs of your binary and my binary
against one another, so make the two perform exactly the same.
It can be copied from ~cs520/prog3_ref which is world readable.
Note that the reference solution may have bugs in it, and if it does,
extra points will be awarded to students who find bona fide bugs in
the reference solution. The quantity of points to be awarded will be
determined at the sole discretion of the instructor, but will be on
the order of 0.1-2 per bug, depending on the severity and importance of
the bug.
Debugging Output
Remember to comment out or otherwise disable ALL debugging output.
Failing to do so will result in losing points.
Return Status
The program should return 0, unless there is some kind of error.
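One possible shape for the exit status handling is sketched below. The function name run is hypothetical (main would simply return its result), and the particular nonzero value and the messages on stderr are assumptions; check the reference solution for what it actually does on errors.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical entry logic: returns 0 on success, nonzero on any error. */
static int run(int argc, char *argv[])
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s classfile\n",
                argc > 0 ? argv[0] : "dis_class");
        return EXIT_FAILURE;
    }

    FILE *fp = fopen(argv[1], "rb");    /* class files are binary */
    if (fp == NULL) {
        perror(argv[1]);
        return EXIT_FAILURE;
    }

    /* ... parse the class file and disassemble its methods here ... */

    fclose(fp);
    return EXIT_SUCCESS;
}
```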
Source Files
I provide some starter code.
- makefile: a Makefile
- dis_class.c: Main stub
- utils.c: A file with utility functions for checking malloc's return value
- utils.h: A header file for utils.c
- classfile.c: C functions associated with parsing class files
- classfile.h: A header file for parsing class files
- disassemble.c: Stubs for the disassemble function
- disassemble.h: A header file for the disassemble function
These files fill in the structs in classfile.h, which you will then
use to disassemble the various methods in the class file.
The program comes with a function called print_classfile that prints
pretty much all of the information in the classfile struct out to the
command line. I would recommend looking at how this function works to
understand how to access the information in the classfile.
Like the reference solution, it is possible that this code will not be
perfect. Students who are the first to report a bona fide bug will be
awarded extra points; the quantity of points awarded once again will
be determined at the sole discretion of the instructor, but will be on
the order of 0.1-2 per bug, depending on the severity and importance of
the bug.
Opcode List and information
When making this assignment I used the Wikipedia list of Java opcodes.
I copied the table from there, and used that table to map opcodes to
strings. I would encourage you to do the same, so that you won't have
to type 255 different opcodes and their names, and their argument
counts. One useful skill is figuring out how to take data from the
internet or some other source and incorporate that information into
your code. Students are allowed to discuss what general techniques
they use to solve this (i.e. telling your friend you copied the source
code from the wikipedia article and edited it in emacs is okay) but,
as I am sure we are all aware of at this point, giving your friend
whatever you produced is clearly not acceptable.
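For illustration only, one way such a table could look in C is an array indexed by opcode value. The struct, the field names, and lookup_opcode are my own invention for this sketch, and only a handful of entries are shown; the real table would be filled in from the Wikipedia data.

```c
#include <stddef.h>

/* Hypothetical table entry: one per possible opcode byte. */
struct opcode_info {
    const char *name;   /* mnemonic, or NULL if the slot is not filled in */
    int operand_bytes;  /* fixed operand byte count; -1 if variable length */
};

/* Indexed directly by opcode value; unlisted slots are zero-initialized. */
static const struct opcode_info opcode_table[256] = {
    [0x00] = {"nop", 0},
    [0x02] = {"iconst_m1", 0},
    [0x10] = {"bipush", 1},
    [0x84] = {"iinc", 2},
    [0xa7] = {"goto", 2},
    [0xaa] = {"tableswitch", -1},
    [0xb1] = {"return", 0},
    [0xb7] = {"invokespecial", 2},
};

/* Return the entry for op, or NULL if the table has no name for it. */
static const struct opcode_info *lookup_opcode(unsigned char op)
{
    return opcode_table[op].name != NULL ? &opcode_table[op] : NULL;
}
```

Indexing by the opcode byte makes lookup a single array access, and the C99 designated initializers keep each row next to its hex value, which is easy to generate with an editor macro or a short script.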
Building the program
This program is a bit larger than the last ones, so you will be given
some flexibility regarding building the program. I provide a starter
makefile, which you may modify if you add additional source files.
You must submit all of the files necessary to compile and run your
program. I expect all files to reside in the same directory, at which
point I will execute the command "make". I expect there will be a
makefile present that will allow make to properly build your
executable with the correct name. If anyone is having problems
getting a basic makefile to do something, please let me know via
piazza (anonymously if you wish) and I will point to some resources I
have used in the past to get makefiles to work.
Testing the program
In order to test this program, you need java class files. As you
probably remember from past java classes, these can easily be created
using javac or basically any other compiler for the java language. As
always, testing is an important part of doing well on this program,
and to encourage collaboration on testing, students are allowed to
share java files with one another.
Test Plan Documentation
One of the primary purposes of this course is to get you used to doing
heavy duty testing on your programs. In support of that goal, when
you submit this program, I would also like you to submit a document
called testplan.txt. This document should have the following
components:
- Describe any tools you used or constructed to assist with testing
(e.g. scripts, etc).
- For every part of the program you believe you deserve points for,
convince me that the testing you did was adequate to prove that the
program reliably does that functionality. For example, describe the
test cases you ran, and why those test cases are good
exemplars. For example, why do you think your program correctly
handles opcodes with no arguments? Did you check the output of all
opcodes? Were you able to verify that all of the opcodes have the
correct name? How did you do this?
The reason I am asking for this document is to make you think
beforehand about what kind of testing you want to do, and also to make
sure after the fact that the testing that you did was adequate.
If your program works poorly, writing a high-quality test plan, even if
your program isn't actually able to pass the plan, is a good way to
earn points.
Grading
Your program will be graded primarily by testing it for correct
functionality. Correct functionality is defined as matching the
reference solution.
- 10 points will be given based upon the testplan documentation.
- 40 points will be awarded for correctly handling opcodes that
have no arguments.
- 10 additional points will be awarded for correctly handling
opcodes that have 1 character (1 byte) worth of arguments.
- 10 additional points will be awarded for correctly handling
opcodes that have a single argument constructed from 2 characters
(e.g. 0xa7 (goto), 0xb7 (invokespecial)).
- 5 additional points will be awarded for correctly handling
opcodes that have a single argument constructed from 4 characters
(e.g. 0xc8 (goto_w), 0xc9 (jsr_w)).
- 5 additional points will be awarded for correctly handling the iinc
(0x84) opcode.
- 5 additional points will be awarded for correctly handling the
multianewarray (0xc5) opcode.
- 5 additional points will be awarded for correctly handling the
invokeinterface (0xb9) opcode.
- 5 additional points will be awarded for correctly handling the
tableswitch (0xaa) opcode.
- 5 additional points will be awarded for correctly handling the
lookupswitch (0xab) opcode.
You must do the first 3 in order. 4-9 can be done independently in
any order; I have arranged them in what I believe to be order of
difficulty.
In addition, remember, you may lose points if your program is not properly
structured or adequately documented.
Coding guidelines are given on the course overview webpage.
Your programs will be graded using agate.cs.unh.edu
so be sure to test in that environment.
Remember: as always you are expected to do your own work on this assignment.
Copying code from another student or from sites on the internet is
explicitly forbidden!
Submission
Your programs should be submitted for grading from
agate.cs.unh.edu. In order to turn in the program,
first make sure you are SSH'd in to agate. To turn in this program,
type:
% ~cs520/bin/DoSubmission.py prog3 file1 file2 file3 file4
This submission script is new. It passed what testing I have done on
it, but it may still have issues. If there are any problems, please
contact me via email and I will do my best to assist you. If I cannot
be reached, please send me a copy of your assignment via email, and we
will deal with the submission script later.
Due Date
This assignment is due Sunday October 6. The standard policy
concerning late submissions will be in effect.