CS520
Fall 2013
Program 3
Due Sunday October 6


Write a C program, called dis_class, that takes a single command line parameter, which is the name of a java class file. The program should then parse the class file, and disassemble all methods found in the classfile.

Command Line Arguments

The program takes one argument:
  1. Name of the java class file to disassemble

For example:

./dis_class Test.class
denotes reading the content of Test.class and disassembling the methods found.
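
As a minimal sketch of the argument handling described above, the helper below checks that exactly one argument was supplied. The name class_file_arg and the wording of the usage message are hypothetical; the starter code may already handle this differently.

```c
#include <stdio.h>

/* Hypothetical helper: returns the class file name when exactly one
 * argument was supplied, or NULL (after printing a usage line to
 * stderr) otherwise. */
const char *class_file_arg(int argc, char *argv[])
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <classfile>\n",
                argc > 0 ? argv[0] : "dis_class");
        return NULL;
    }
    return argv[1];
}
```

A main function could call this first and return a nonzero status when it yields NULL.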

Functionality

Each method that is disassembled should be printed out in the following format:
cmo66@zelinka:~/cs520/p3/classfile$ ./dis_class tests/Pathetic.class 
Method Name: <init>
	0 aload_0 
	1 invokespecial (1)
	4 return 

Method Name: main
	0 return 
Each method should have "Method Name: " followed by the method's name as looked up in the constant table.

Each instruction should appear on its own line: a tab, followed by the byte offset, followed by a space, followed by the opcode name. Opcodes that take a single parameter should have the parameter following in parentheses, like invokespecial as seen above. Opcodes that take more than one parameter should have the parameters listed inside the parentheses, separated by a comma and a space, like iinc as follows.

cmo66@zelinka:~/cs520/p3/classfile$ ./dis_class tests/Iinc.class 
Method Name: <init>
	0 aload_0 
	1 invokespecial (1)
	4 return 

Method Name: main
	0 iconst_0 
	1 istore_1 
	2 iinc (132, 1)
	5 getstatic 
	6 nop 
	7 iconst_m1 
	8 iload_1 
	9 invokevirtual (3)
	12 return 
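
One possible way to produce lines in this format is sketched below. The function name format_insn and its parameter list are hypothetical, and the trailing space after opcodes without parameters is inferred from the sample output above, so check it against the reference binary.

```c
#include <stdio.h>

/* Hypothetical helper: formats one disassembly line into buf.
 * nargs is 0, 1, or 2; unused argument slots are ignored.
 * Returns what snprintf returns. */
int format_insn(char *buf, size_t size, unsigned offset, const char *name,
                int nargs, int arg0, int arg1)
{
    switch (nargs) {
    case 0:
        /* the sample output shows a space after the opcode name */
        return snprintf(buf, size, "\t%u %s \n", offset, name);
    case 1:
        return snprintf(buf, size, "\t%u %s (%d)\n", offset, name, arg0);
    default:
        return snprintf(buf, size, "\t%u %s (%d, %d)\n",
                        offset, name, arg0, arg1);
    }
}
```

Formatting into a buffer rather than printing directly makes the exact bytes easy to compare in a test.
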
Note that the various types of jumps also require special care. In the class file, a jump's operand is stored as an offset relative to the byte offset of the jump instruction itself, but it should be printed as the final destination, not the delta. You will therefore need to record the byte offset of the current instruction and add the operand to it to get the final destination byte.
-bash-4.2$ ./dis_class tests/Iftest.class 
Method Name: <init>
	0 aload_0 
	1 invokespecial (1)
	4 return 

Method Name: itm
	0 iload_1 
	1 bipush (100)
	3 if_icmpge (17)
	6 getstatic (2)
	9 ldc (3)
	11 invokevirtual (4)
	14 goto (25)
	17 getstatic (2)
	20 ldc (5)
	22 invokevirtual (4)
	25 return 
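
The arithmetic for a 2-byte jump can be sketched as follows. The function name branch_target is hypothetical, and the sketch assumes code points at the first byte of the method's bytecode. The operand is big-endian and signed, so backward jumps come out as negative deltas.

```c
#include <stdint.h>

/* Hypothetical helper: code[pc] is the jump opcode and code[pc+1..pc+2]
 * hold a signed big-endian offset relative to pc. Returns the final
 * destination byte. */
int32_t branch_target(const uint8_t *code, uint32_t pc)
{
    uint16_t raw = (uint16_t)((code[pc + 1] << 8) | code[pc + 2]);
    /* sign-extend the 16-bit value by hand to stay portable */
    int32_t delta = raw >= 0x8000 ? (int32_t)raw - 0x10000 : (int32_t)raw;
    return (int32_t)pc + delta;
}
```

For the if_icmpge at byte 3 in the output above, the stored operand is 14, and 3 + 14 gives the printed destination 17.
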
The two other nontrivial opcodes are tableswitch and lookupswitch. See below for the formatting of those two opcodes. The general format is that each field describing the table as a whole (default, low, high, nPairs, whichever are present for that switch) is written one per line starting with 2 tabs, and the individual cases are written one per line starting with 3 tabs. Note that the targets in the tables are also stored as byte offsets, and must be converted to final destinations in the same way as jumps.
-bash-4.2$ ./dis_class tests/Tableswitch.class 
Method Name: <init>
	0 aload_0 
	1 invokespecial (1)
	4 return 

Method Name: main
	0 bipush (10)
	2 istore_1 
	3 iload_1 
	4 tableswitch 
		default: 32
		low: 0
		high: 2
			0: 32
			1: 32
			2: 32
	32 return 
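
A rough sketch of decoding tableswitch follows. It relies on the layout in the JVM specification: after the opcode come 0 to 3 padding bytes so that the next field starts on a 4-byte boundary measured from the start of the method's code, then default, low, and high as signed 32-bit big-endian values, then high - low + 1 jump offsets. The names read_s4 and print_tableswitch are hypothetical, and code is assumed to point at the first byte of the method's bytecode.

```c
#include <stdint.h>
#include <stdio.h>

/* Read a signed 32-bit big-endian value starting at p. */
static int32_t read_s4(const uint8_t *p)
{
    uint32_t v = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
                 ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
    return (int32_t)v;
}

/* code[pc] is the tableswitch opcode. Prints the instruction and
 * returns the byte offset of the next instruction. */
uint32_t print_tableswitch(FILE *out, const uint8_t *code, uint32_t pc)
{
    uint32_t p = (pc + 4) & ~(uint32_t)3;  /* first 4-byte boundary after the opcode */
    int32_t dflt = read_s4(code + p);
    int32_t low  = read_s4(code + p + 4);
    int32_t high = read_s4(code + p + 8);
    p += 12;

    fprintf(out, "\t%u tableswitch \n", (unsigned)pc);
    fprintf(out, "\t\tdefault: %d\n", (int)((int32_t)pc + dflt));
    fprintf(out, "\t\tlow: %d\n", (int)low);
    fprintf(out, "\t\thigh: %d\n", (int)high);
    /* every stored offset is relative to the opcode's own byte offset */
    for (int64_t i = low; i <= high; i++, p += 4)
        fprintf(out, "\t\t\t%lld: %d\n", (long long)i,
                (int)((int32_t)pc + read_s4(code + p)));
    return p;
}
```

In the example above the opcode sits at byte 4, so the fields start at byte 8 and the three table entries end at byte 31, which is why the next instruction is printed at 32.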

-bash-4.2$ ./dis_class tests/Lookupswitch.class 
Method Name: <init>
	0 aload_0 
	1 invokespecial (1)
	4 return 

Method Name: main
	0 bipush (10)
	2 istore_1 
	3 iload_1 
	4 lookupswitch 
		default: 40
		nPairs: 3
			0: 40
			10: 40
			100: 40
	40 return 
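
A comparable sketch for lookupswitch, again following the JVM specification layout: 0 to 3 padding bytes up to a 4-byte boundary, then default and npairs as signed 32-bit big-endian values, then npairs pairs of (match, offset). The names read_s4 and print_lookupswitch are hypothetical, and code is assumed to point at the first byte of the method's bytecode.

```c
#include <stdint.h>
#include <stdio.h>

/* Read a signed 32-bit big-endian value starting at p. */
static int32_t read_s4(const uint8_t *p)
{
    uint32_t v = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
                 ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
    return (int32_t)v;
}

/* code[pc] is the lookupswitch opcode. Prints the instruction and
 * returns the byte offset of the next instruction. */
uint32_t print_lookupswitch(FILE *out, const uint8_t *code, uint32_t pc)
{
    uint32_t p = (pc + 4) & ~(uint32_t)3;  /* first 4-byte boundary after the opcode */
    int32_t dflt   = read_s4(code + p);
    int32_t npairs = read_s4(code + p + 4);
    p += 8;

    fprintf(out, "\t%u lookupswitch \n", (unsigned)pc);
    fprintf(out, "\t\tdefault: %d\n", (int)((int32_t)pc + dflt));
    fprintf(out, "\t\tnPairs: %d\n", (int)npairs);
    /* each pair is a match value followed by an offset relative to pc */
    for (int32_t i = 0; i < npairs; i++, p += 8)
        fprintf(out, "\t\t\t%d: %d\n", (int)read_s4(code + p),
                (int)((int32_t)pc + read_s4(code + p + 4)));
    return p;
}
```

With the opcode at byte 4, the fields start at byte 8 and the three 8-byte pairs end at byte 39, matching the return printed at 40 above.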

Ignore These Opcodes (and extra credit)

I have been having difficulty generating test data for the opcodes 0xc4 (wide) and 0xba (invokedynamic). For this reason, these opcodes will not be part of this assignment, and you can assume that your program is not required to handle them. Any student who wishes to receive additional points on the program must do the following:
  1. Create a java program that produces the relevant opcode (0xc4 or 0xba) when compiled on agate
  2. Make your program handle the selected opcode correctly, as specified above.
  3. Send me an email with the java file prior to the due date
Doing this for either opcode will earn 5 extra points, for a maximum of 10 possible points. Note that this may not be trivial, and may actually be impossible; I couldn't figure out a good way to make either opcode show up.

Reference Solution

I will be providing a binary that you can use to compare your results against. I will be comparing the outputs of your binary and my binary against one another, so make the two perform exactly the same.

It can be copied from ~cs520/prog3_ref which is world readable.

Note that the reference solution may have bugs in it, and if it does, extra points will be awarded to students who find bona fide bugs in the reference solution. The quantity of points to be awarded will be determined at the sole discretion of the instructor, but will be on the order of 0.1-2 per bug, depending on the severity and importance of the bug.

Debugging Output

Remember to comment out or otherwise disable ALL debugging output. Failing to do so will result in losing points.
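
One common way to make debugging output easy to disable is a macro that compiles to nothing unless a flag is defined. The macro name DEBUG_PRINT and the DEBUG flag below are assumptions, not part of the starter code.

```c
#include <stdio.h>

/* Hypothetical macro: prints to stderr only when built with -DDEBUG. */
#ifdef DEBUG
#define DEBUG_PRINT(...) fprintf(stderr, __VA_ARGS__)
#else
#define DEBUG_PRINT(...) ((void)0)
#endif
```

When DEBUG is not defined the arguments are not evaluated at all, so they should not contain side effects the program depends on.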

Return Status

The program should return 0, unless there is some kind of error.

Source Files

I provide some starter code.

  1. makefile: a Makefile
  2. dis_class.c: Main stub
  3. utils.c: A file with utility functions for checking malloc's return value
  4. utils.h: A header file for utils.c
  5. classfile.c: C functions associated with parsing class files
  6. classfile.h: A header file for parsing class files
  7. disassemble.c: Stubs for the disassemble function
  8. disassemble.h: A header file for disassemble function
These files fill in the structs in classfile.h, which you will then use to disassemble the various methods in the class file. The program comes with a function called print_classfile that prints pretty much all of the information in the classfile struct out to the command line. I would recommend looking at how this function works to understand how to access the information in the classfile.

Like the reference solution, it is possible that this code will not be perfect. Students who are the first to report a bona fide bug will be awarded extra points; the quantity of points awarded once again will be determined at the sole discretion of the instructor, but will be on the order of 0.1-2 per bug, depending on the severity and importance of the bug.

Opcode List and information

When making this assignment I used the Wikipedia list of Java opcodes.

I copied the table from there, and used that table to map opcodes to strings. I would encourage you to do the same, so that you won't have to type 255 different opcodes and their names, and their argument counts. One useful skill is figuring out how to take data from the internet or some other source and incorporate that information into your code. Students are allowed to discuss what general techniques they use to solve this (i.e. telling your friend you copied the source code from the wikipedia article and edited it in emacs is okay) but, as I am sure we are all aware of at this point, giving your friend whatever you produced is clearly not acceptable.
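
As an illustration of what such a mapping might look like once it is in C, here is a sketch of a table indexed by opcode value. The struct layout, the names opcode_table and lookup_opcode, and the use of -1 for variable-length opcodes are all assumptions; only a handful of entries are filled in.

```c
#include <stddef.h>

/* Hypothetical table entry: mnemonic plus the number of operand bytes. */
struct opcode_info {
    const char *name;   /* NULL means the entry has not been filled in */
    int operand_bytes;  /* bytes after the opcode, -1 for variable length */
};

/* Designated initializers keep each entry next to its opcode value. */
static const struct opcode_info opcode_table[256] = {
    [0x00] = {"nop", 0},
    [0x03] = {"iconst_0", 0},
    [0x10] = {"bipush", 1},
    [0x2a] = {"aload_0", 0},
    [0x84] = {"iinc", 2},
    [0xa7] = {"goto", 2},
    [0xaa] = {"tableswitch", -1},
    [0xab] = {"lookupswitch", -1},
    [0xb1] = {"return", 0},
    [0xb7] = {"invokespecial", 2},
    [0xc8] = {"goto_w", 4},
};

/* Returns the entry for op, or NULL if the table has no name for it. */
const struct opcode_info *lookup_opcode(unsigned char op)
{
    return opcode_table[op].name ? &opcode_table[op] : NULL;
}
```

A table in this shape can be generated mechanically from a copied list, which avoids typing each entry by hand.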

Building the program

This program is a bit larger than the last ones, so you will be given some flexibility regarding building the program. I provide a starter makefile, which you may modify if you add additional source files. You must submit all of the files necessary to compile and run your program. I expect all files to reside in the same directory, at which point I will execute the command "make". I expect there will be a makefile present that will allow make to properly build your executable with the correct name. If anyone is having problems getting a basic makefile to do something, please let me know via piazza (anonymously if you wish) and I will point to some resources I have used in the past to get makefiles to work.

Testing the program

In order to test this program, you need java class files. As you probably remember from past java classes, these can easily be created using javac or basically any other compiler for the java language. As always, testing is an important part of doing well on this program, and to encourage collaboration on testing, students are allowed to share java files with one another.

Test Plan Documentation

One of the primary purposes of this course is to get you used to doing heavy duty testing on your programs. In support of that goal, when you submit this program, I would also like you to submit a document called testplan.txt. This document should have the following components:
  1. Describe any tools you used or constructed to assist with testing (e.g. scripts, etc).
  2. For every part of the program you believe you deserve points for, convince me that the testing you did was adequate to prove that the program reliably provides that functionality. For example, describe the test cases you ran, and why those test cases are good exemplars. Why do you think your program correctly handles opcodes with no arguments? Did you check the output of all opcodes? Were you able to verify that all of the opcodes have the correct name? How did you do this?
The reason I am asking for this document is to make you think beforehand about what kind of testing you want to do, and also to make sure after the fact that the testing that you did was adequate.

If your program works poorly, writing a high quality test plan, even if your program isn't actually able to pass the plan, is a good way to earn points.


Grading

Your program will be graded primarily by testing it for correct functionality. Correct functionality is defined as matching the reference solution.
  1. 10 points will be given based upon the testplan documentation.
  2. 40 points will be awarded for correctly handling opcodes that have no arguments.
  3. 10 additional points will be awarded for correctly handling opcodes that have 1 character (1 byte) worth of arguments.
  4. 10 additional points will be awarded for correctly handling opcodes that have a single argument constructed from 2 characters (e.g. 0xa7 (goto), 0xb7 (invokespecial)).
  5. 5 additional points will be awarded for correctly handling opcodes that have a single argument constructed from 4 characters (e.g. 0xc8 (goto_w), 0xc9 (jsr_w)).
  6. 5 additional points will be awarded for correctly handling the iinc (0x84) opcode.
  7. 5 additional points will be awarded for correctly handling the multianewarray (0xc5) opcode.
  8. 5 additional points will be awarded for correctly handling the invokeinterface (0xb9) opcode.
  9. 5 additional points will be awarded for correctly handling the tableswitch (0xaa) opcode.
  10. 5 additional points will be awarded for correctly handling the lookupswitch (0xab) opcode.

You must do the first 3 in order. 4-10 can be done independently in any order; I have arranged them in what I believe to be order of difficulty.
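
The 1, 2, and 4 character arguments mentioned in the list above are stored big-endian immediately after the opcode. A minimal sketch of assembling them follows; the name read_operand is hypothetical, and the value is returned unsigned, so whether a particular opcode's argument should be printed as signed is something to check against the reference solution.

```c
#include <stdint.h>

/* Hypothetical helper: code[pc] is the opcode; combines the nbytes
 * bytes that follow it (1, 2, or 4) into one big-endian value. */
uint32_t read_operand(const uint8_t *code, uint32_t pc, int nbytes)
{
    uint32_t value = 0;
    for (int i = 1; i <= nbytes; i++)
        value = (value << 8) | code[pc + i];
    return value;
}
```

For example, the bytes 0xb7 0x00 0x01 decode to invokespecial with the argument 1.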

In addition, remember, you may lose points if your program is not properly structured or adequately documented. Coding guidelines are given on the course overview webpage.

Your programs will be graded using agate.cs.unh.edu so be sure to test in that environment.

Remember: as always you are expected to do your own work on this assignment. Copying code from another student or from sites on the internet is explicitly forbidden!

Submission

Your programs should be submitted for grading from agate.cs.unh.edu. In order to turn in the program, first make sure you are SSH'd in to agate. To turn in this program, type:
% ~cs520/bin/DoSubmission.py prog3 file1 file2 file3 file4

This submission script is new. It passed what testing I have done on it, but it may still have issues. If there are any problems, please contact me via email and I will do my best to assist you. If I cannot be reached, please send me a copy of your assignment via email, and we will deal with the submission script later.

Due Date

This assignment is due Sunday October 6. The standard late policy will be in effect.