User:Joe Geyer
Joe Geyer
- Miami University, Master's in Computer Science, 2008 - present
- Illinois State University, M.S. Math Education
- Illinois State University, B.S. Math Education
Reading
- Asikainen, T., Mannisto, T., & Soininen, T. (2006). A unified conceptual foundation for feature modelling. Proc. 10th International Software Product Line Conference, 31-40. PDF
- "Recovering Concepts from Source Code with Automated Concept Identification", Maurice M. Carey and Gerald C. Gannod, Proceedings of the 15th IEEE International Conference on Program Comprehension, June 2007 PDF
- Chen, K., Zhang, W., Zhao, H., & Mei, H. (2005). An approach to constructing feature models based on requirements clustering. Proc. 13th IEEE International Conference on Requirements Engineering, 31-40.
- Czarnecki, K., Kim, C. H. P., & Kalleberg, K. T. (2006). Feature models are views on ontologies. Proc. 10th International Software Product Line Conference, 41-51. PDF
- K. Czarnecki, S. Helsen, and U. Eisenecker. Staged Configuration Through Specialization and Multi-Level Configuration of Feature Models. Software Process Improvement and Practice, special issue on "Software Variability: Process and Management", 10(2), 2005, pp. 143-169 PDF
- Czarnecki, K., & Wasowski, A. (2007). Feature diagrams and logics: There and back again. Proc. 11th International Software Product Line Conference SPLC 2007, 23-34. PDF
- Czarnecki, K. (2004). FeaturePlugin: Feature modeling plug-in for Eclipse.
- K. Czarnecki. Overview of Generative Software Development. In J.-P. Banâtre et al. (Eds.): Unconventional Programming Paradigms (UPP) 2004, Mont Saint-Michel, France, LNCS 3566, pp. 313–328, 2005
- Gannod, G. C., Timm, John T. E., & Brodie, R. J. (2006). Facilitating the specification of semantic web services using model-driven development. International Journal of Web Services Research, 3(3), 61-81. PDF
- Gannod, G. C., & Timm, John T. E. (2004). An MDA-based approach for facilitating adoption of semantic web service technology. Paper presented at the Proceedings of the 8th IEEE EDOC Enterprise Computing Conference Workshop on Model-Driven Semantic Web, Monterey, California. PDF
- Poshyvanyk, D., & Marcus, A. (2007). Combining formal concept analysis with information retrieval for concept location in source code. Proc. 15th IEEE International Conference on Program Comprehension, 37-48. PDF
- Sun, J., Zhang, H., Fang, Y., & Wang, L. H. (2005). Formal semantics and verification for feature modeling. Proc. 10th IEEE International Conference on Engineering of Complex Computer Systems, 303-312. PDF
- Timm, John T. E., & Gannod, G. C. (2005). A model-driven approach for specifying semantic web services. Paper presented at the Proceedings of the 3rd IEEE International Conference on Web Services (ICWS 2005), Orlando, FL. PDF
- Ontology Learning Papers
- Text2Onto.pdf is a paper that briefly describes the text2onto tool. This is the ontology learning tool that I have been using so far. It is a plugin that is being used in the NeON ontology engineering environment.
Interests
- Reverse Software Engineering
- Data Mining with RapidMiner
- Feature Modeling Feature Model Literature
- Feature modeling plugin developed at the University of Waterloo's Generative Software Development Lab.
Research Information
| System Name | Domain | Number of Classes | URL |
|---|---|---|---|
| Alfresco | CMS | 2029 | www.alfresco.com |
| ANTLR | Lexer and Parser Generator | 1241 | www.antlr.org |
| Apache Ant | Build Tool | tbd | ant.apache.org |
| Apache Axis2 | Web Services | tbd | ws.apache.org/axis2 |
| Apache Derby | Database | tbd | db.apache.org/derby |
| Apache Roller | Weblog Server | tbd | roller.apache.org |
| Apache Xalan | XSLT Processor | tbd | xalan.apache.org |
| Apache Xerces | XML Parser | tbd | xerces.apache.org |
| ArgoUML | UML Tool | 1711 | argouml.tigris.org |
| Azureus (Vuze) | BitTorrent Client | tbd | www.vuze.com |
| Checkstyle | Code Analyzer | tbd | checkstyle.sourceforge.net |
| Eclipse | IDE | tbd | www.eclipse.org/downloads |
| Jakarta Tomcat | Webserver | tbd | jakarta.apache.org |
| Jake2 | 3-D Game | tbd | bytonic.de/html/jake2.html |
| JBoss | JEE Application Server | tbd | www.jboss.org |
| Edit | Text Editor | tbd | unknown |
| JHotDraw | Graphics | tbd | www.hotdraw.org |
| Hibernate | ORM Framework | tbd | www.hibernate.org |
| IzPack | Installer Generator | tbd | izpack.org |
| OpenProj | Project Management | tbd | openproj.org |
| Panda | Proof Assistant | tbd | ASU |
| Phex | P2P File Sharing | tbd | www.phex.org |
| PMD | Code Analyzer | tbd | pmd.sourceforge.net |
| RapidMiner | Data Mining | tbd | rapid-i.com |
| Scarab | Artifact Tracking System | tbd | scarab.tigris.org |
| Spark | Jabber Client | tbd | www.igniterealtime.org/projects/spark |
| Spring | Application Framework | tbd | www.springframework.org |
| Spring IDE | IDE Extension | tbd | www.springsource.org/springide/release-20 |
Status Report
- 1/29 - 2/4
- Create a demonstration of the feature modeling plugin.
- Read
- Sun, J., Zhang, H., Fang, Y., & Wang, L. H. (2005). Formal semantics and verification for feature modeling. Proc. 10th IEEE International Conference on Engineering of Complex Computer Systems, 303-312. PDF
- Asikainen, T., Mannisto, T., & Soininen, T. (2006). A unified conceptual foundation for feature modelling. Proc. 10th International Software Product Line Conference, 31-40. PDF
- K. Czarnecki. Overview of Generative Software Development. In J.-P. Banâtre et al. (Eds.): Unconventional Programming Paradigms (UPP) 2004, Mont Saint-Michel, France, LNCS 3566, pp. 313–328, 2005
- Use StarUML with textbooks to continue learning UML
- 2/5 - 2/11
- Create demonstration of Dell computer configuration with feature modeling.
- Possible papers for presentation.
- K. Czarnecki, S. Helsen, and U. Eisenecker. Staged Configuration Through Specialization and Multi-Level Configuration of Feature Models. Software Process Improvement and Practice, special issue on "Software Variability: Process and Management", 10(2), 2005, pp. 143-169 PDF
- Chen, K., Zhang, W., Zhao, H., & Mei, H. (2005). An approach to constructing feature models based on requirements clustering. Proc. 13th IEEE International Conference on Requirements Engineering, 31-40. PDF
- Czarnecki, K., Kim, C. H. P., & Kalleberg, K. T. (2006). Feature models are views on ontologies. Proc. 10th International Software Product Line Conference, 41-51. PDF
- 2/12 - 2/18
- Learn more about product line development with textbooks.
- Develop more models with the feature model plugin.
- 2/19 - 2/25
- Prepare rough draft for presentation of "Staged Configuration Through Specialization and Multi-Level Configuration of Feature Models"
- K. Czarnecki, S. Helsen, and U. Eisenecker. Staged Configuration Through Specialization and Multi-Level Configuration of Feature Models. Software Process Improvement and Practice, special issue on "Software Variability: Process and Management", 10(2), 2005, pp. 143-169 PDF
- Develop some examples of the concepts from the article with the feature model plugin.
- Read about Software Product Line Engineering to supplement knowledge with the article.
- 2/26 - 3/3
- Finish presentation for the article "Staged Configuration Through Specialization and Multi-Level Configuration of Feature Models". PDF
- Develop some examples of the concepts from the article with the feature model plugin.
- Installation Instructions for the feature model plugin.
- Read about Software Product Line Engineering to supplement knowledge with the article.
- Find out how to convert Feature Models to UML class diagrams.
- Begin to learn XML.
- 3/4 - 3/10
- Read
- J. Burge, D.C. Brown, "Software Engineering Using RATionale", Journal of Systems and Software, 81(3): 395-413
- J. Burge, D.C. Brown, "SEURAT: Integrated Rationale Management", to appear in the Proceedings of the 30th International Conference on Software Engineering (ICSE), Formal Research Demonstrations track, Leipzig, Germany, 10 - 18 May 2008
- Investigate Mapping from Feature Models to UML with fmp2rsm plugin.
- Finish article presentation.
- 3/25 - 3/31
- Temporary Course Plan for Graduate School
- Breadth Courses
- Advanced Networks - 617
- Software Engineering - 621
- Mathematical Modeling - 615 (or Machine Learning - 627?)
- Introduction to Artificial Intelligence - 586
- Advanced Database Systems - 585
- Electives
- Web Services and SOA - 570
- Bioinformatics - 570
- 600 level
- Network Security - 620 (Possibly instead of Mathematical Modeling)
- Read
- Gannod, G. C., Timm, John T. E., & Brodie, R. J. (2006). Facilitating the specification of semantic web services using model-driven development. International Journal of Web Services Research, 3(3), 61-81. PDF
- Timm, John T. E., & Gannod, G. C. (2005). A model-driven approach for specifying semantic web services. Paper presented at the Proceedings of the 3rd IEEE International Conference on Web Services (ICWS 2005), Orlando, FL. PDF
- 4/1 - 4/7
- Read
- Gannod, G. C., & Timm, John T. E. (2004). An MDA-based approach for facilitating adoption of semantic web service technology. Paper presented at the Proceedings of the 8th IEEE EDOC Enterprise Computing Conference Workshop on Model-Driven Semantic Web, Monterey, California. PDF
- Gain some experience with Web Services to better understand articles
- Watched Podcasts from Web Services and SOA 470
- 4/8 - 4/14
- Feature Models and Rationale
- Chen, K., Zhang, W., Zhao, H., & Mei, H. (2005). An approach to constructing feature models based on requirements clustering. Proc. 13th IEEE International Conference on Requirements Engineering, 31-40, 29 Aug.-2 Sept. 2005.
- This article discusses a method for creating feature models through requirements clustering. I think we are looking for a way to use feature models, tied together with rationale, to create UML and then code. I include this article because it covers the idea of creating feature models based on requirements. This reminded me of Dr. Burge's work on connecting rationale to code.
- Czarnecki, K., & Antkiewicz, M. (2005). Mapping features to models: A template approach based on superimposed variants. In Proceedings of the International Conference on Generative Programming and Component Engineering (GPCE '05), vol. 3676 of LNCS, 422-437, Springer.
- This article discusses the role that presence conditions (PCs) and meta-expressions (MEs) play in mapping features to models. PCs indicate which features should be included in the model and which should not. The authors have built a tool, fmp2rsm, an Eclipse plug-in that works with IBM's Rational Software Modeler and the Feature Modeling plug-in. It is able to work with PC constraints. A demonstration of fmp2rsm is here and a demonstration of verifying model templates is here.
- Czarnecki, K., & Pietroszek, K. (2006). Verifying feature-based model templates against well-formedness OCL constraints. In Proceedings of the International Conference on Generative Programming and Component Engineering (GPCE '06), ACM Press.
- Well-formedness constraints can be expressed in the Object Constraint Language (OCL). "The semantics maps OCL constraints to propositional formulas, which are then fed into a SAT solver".
- 4/15 - 4/21
- Feature model plugin is installed on my lab computer in the graduate office.
- Key points from last meeting
- Rationale with feature models as Prescriptive v. Descriptive
- Feature modeling or product line approach is beneficial to rationale because it gives ways of reasoning about all of the alternatives.
- Feature modeling offers post-hoc reasoning about why the structure/architecture is the way it is.
- How do we capture rationale with feature models?
- How do we recover rationale?
- How do we evaluate decisions with rationale? Difference in designer opinion
- 4/22 - 4/28
- 5/20 - 5/26
- Source Code for Feature Model Plugin
- 6/23 - 6/29
- Zip file that contains the feature model plugin and directions for installation. download file
- 7/25 - 7/31
- Reading
- Dietterich, T. G. (2003). Machine Learning. In Nature Encyclopedia of Cognitive Science, London: Macmillan, 2003.
- This article gives an introductory overview of machine learning. It contains sections on supervised and unsupervised learning. After presenting important terminology (e.g., classifier, generalization, training set), it gives an example of learning decision trees. Within this example, the errors of overfitting and underfitting are discussed. Several examples are also given for unsupervised learning. The article does not include a discussion of SVMs.
- Chs. 18, 20 from Russell, S. (2003). Artificial intelligence: A modern approach. New Jersey. Pearson.
- These chapters introduce some of the concepts of learning in artificial intelligence. The text also includes a short introduction to SVMs.
- RapidMiner Tutorial
- RapidMiner is a learning environment. I am interested in how a training set can be used to create a learning model using a learning algorithm. I downloaded RapidMiner on my computer in the lab and worked through the online tutorial. The tutorial showed me how experiments are set up and how to view the results.
- I borrowed a linear algebra textbook and reviewed some of the sections on linear transformations, kernels, hyperplanes, etc. These are topics or words that seemed to pop up in my reading on SVMs from other sources.
- Goals
- Develop a small experiment in RapidMiner that uses decision trees and SVMs, and compare the results.
- Continue to learn about more types of Machine Learning techniques. (Naive Bayes)
- Find and work through an SVM tutorial.
- 8/7 - 8/14
- Wrote a RapidMiner Tutorial using three datasets that were taken from the UCI Machine Learning Repository.
- Reading
- "Recovering Concepts from Source Code with Automated Concept Identification", Maurice M. Carey and Gerald C. Gannod, Proceedings of the 15th IEEE International Conference on Program Comprehension, June 2007 PDF
- 8/29 - 9/5
- Prepared presentations for Software Engineering Reading Group
- J. Bowring, J. Rehg, and M.J. Harrold, "Active Learning for Automatic Classification of Software Behavior", in proceedings of ISSTA '04, 2004.
- 9/6 - 9/12
- Downloaded SVM Classification plugin for Eclipse.
- Imported software systems into Eclipse for classification.
- Systems imported into Eclipse
- RapidMiner
- Jake2
- Began classification of concept classes
- I need more training on how to identify concept vs. non-concept classes.
- 10/24 - 10/29
- Developed a Java program to scan a text document
- Puts the words in a sorted array list.
- Does not save a word more than once.
- We will use these words as keywords to match the names of Java classes.
- Current issues
- I need to find a way to extract all of the class names from a software system. I can export all the .java files, but I just need a list of the names, not the files.
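The keyword scan described in the 10/24 - 10/29 entry can be sketched roughly as follows. This is only an illustration, not the original program: the class name `KeywordScanner`, the method `extractKeywords`, and the decisions to lower-case words and to split on non-letter characters are all assumptions. A `TreeSet` is used here because it keeps words sorted and drops repeats in one step; the entry itself mentions a sorted array list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.TreeSet;

// Hypothetical sketch; names and tokenization rules are assumptions.
final class KeywordScanner {

    // Splits the text on runs of non-letter characters, lower-cases each
    // word, and returns the distinct words in sorted order.
    static List<String> extractKeywords(String text) {
        TreeSet<String> unique = new TreeSet<>(); // sorted, ignores duplicates
        for (String token : text.split("[^A-Za-z]+")) {
            if (!token.isEmpty()) { // split can yield an empty leading token
                unique.add(token.toLowerCase(Locale.ROOT));
            }
        }
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        String sample = "The class diagram shows each class; the Diagram is a view.";
        System.out.println(extractKeywords(sample));
    }
}
```

To scan a real document, the text passed to `extractKeywords` could first be read from a file, for example with `java.nio.file.Files.readString` on Java 11 or later.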
- 11/7 - 11/14
- Compared class names and keywords with a Java program.
- 11/14 - 11/21
- ArgoUML System
- Implement root word filters
- Implement vowel filter
- Informally test the keyword matching
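A rough sketch of how a vowel filter and keyword matching might fit together is shown below. It rests on several assumptions: that the vowel filter simply removes a, e, i, o, and u (the form `clssdgrm` recorded in the 1/20/2009 results is consistent with this), that matching is a substring test, and that the "5 characters" and "3 characters" figures are minimum keyword lengths after filtering. The root word filter is not sketched because its rules are not described on this page. All class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch; the filter and matching rules are assumptions.
final class VowelFilterMatcher {

    // Lower-cases a name and removes the vowels a, e, i, o, u.
    static String stripVowels(String name) {
        return name.toLowerCase(Locale.ROOT).replaceAll("[aeiou]", "");
    }

    // True when the filtered class name contains the filtered form of some
    // keyword whose filtered form has at least minLength characters.
    static boolean matches(String className, List<String> keywords, int minLength) {
        String filteredName = stripVowels(className);
        for (String keyword : keywords) {
            String filteredKeyword = stripVowels(keyword);
            if (filteredKeyword.length() >= minLength
                    && filteredName.contains(filteredKeyword)) {
                return true;
            }
        }
        return false;
    }

    // Keeps class names that match a concept keyword (length >= 5) and do
    // not match any non-concept keyword (length >= 3).
    static List<String> predictConcepts(List<String> classNames,
                                        List<String> conceptKeywords,
                                        List<String> nonConceptKeywords) {
        List<String> predicted = new ArrayList<>();
        for (String name : classNames) {
            if (matches(name, conceptKeywords, 5)
                    && !matches(name, nonConceptKeywords, 3)) {
                predicted.add(name);
            }
        }
        return predicted;
    }

    public static void main(String[] args) {
        List<String> names = List.of("DeploymentDiagram", "DeploymentString", "Parser");
        System.out.println(predictConcepts(names, List.of("deployment"), List.of("string")));
    }
}
```

Subtracting the non-concept matches from the concept matches mirrors the set difference noted in the results ({concept} – {non-concept} = {new concept}).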
- 11/21 - 11/28
- Post Systems to be classified on wiki
- Export ArgoUML classes to an Excel spreadsheet.
- Research Meeting Discussion
- What research has been conducted on concept extraction from a document?
- Possibly look at Scarab after ArgoUML
- Need to hand label the ArgoUML Excel Spreadsheet
- 1/20/2009 - 1/27/2009
- Initial results from training set classifier
- Total Classes = 1711
- RootWordFilter(total classes) = 1176
- Oracle:
- Actual Concept Classes = 126
- Actual Non-concept classes = 1585
- Keywords:
- From PDF = 999 (used for positive ID of concept classes)
- From Java = 97 (used for negative ID of concept classes)
- Mapping : (uses root word filter on classes)
- With vowel filter
- 396 concept classes (5 characters)
- 86 non-concept classes (3 characters)
- 355 after subtracting non-concept matches
- Without vowel filter
- 684 concept classes (5 characters)
- 257 non-concept classes (3 characters)
- 524 after subtracting non-concept matches ({concept} – {non-concept} = {new concept})
- With vowel and prefix filter (“action”, “association”)
- 320 concept classes (5 characters)
- 85 non-concept classes (3 characters)
- 279 after subtracting non-concept matches
- Results compared to Oracle:
- With vowel filter
- 355 predicted
- 126 actual
- 3 matches ???? (clssdgrm, dplymntdgrm, sqncdgrm)
- Without vowel filter
- 524 predicted
- 126 actual
- 37 matches -> 29% of actual, 7% of predicted
- With vowel filter and prefix filter (“action”, “association”)
- 279 predicted
- 126 actual
- 3 matches ????
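The percentages in the oracle comparison ("29% of actual, 7% of predicted") correspond to what is usually called recall and precision. The sketch below recomputes them from the counts listed in the 1/20/2009 entry; the class and method names are hypothetical, and the three labelled runs simply restate the predicted, actual, and match counts given above.

```java
import java.util.Locale;

// Hypothetical helper; only the counts come from the status report.
final class OracleComparison {

    // Fraction of predicted concept classes that the oracle confirms.
    static double precision(int matches, int predicted) {
        return predicted == 0 ? 0.0 : (double) matches / predicted;
    }

    // Fraction of the oracle's concept classes that were predicted.
    static double recall(int matches, int actual) {
        return actual == 0 ? 0.0 : (double) matches / actual;
    }

    static String report(String label, int matches, int predicted, int actual) {
        return String.format(Locale.ROOT, "%s: recall %.1f%%, precision %.1f%%",
                label,
                100 * recall(matches, actual),
                100 * precision(matches, predicted));
    }

    public static void main(String[] args) {
        System.out.println(report("With vowel filter", 3, 355, 126));
        System.out.println(report("Without vowel filter", 37, 524, 126));
        System.out.println(report("With vowel and prefix filter", 3, 279, 126));
    }
}
```

For the run without the vowel filter this gives 37/126 for recall and 37/524 for precision, which round to the 29% and 7% reported above.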
Thesis Proposal
- March 31, 2009
- Thesis Proposal version 1
- April 6, 2009
- Thesis Proposal version 2
- Includes an extra diagram to show the filtering process.
- April 21, 2009
- Thesis Proposal version 3
- More developed section on Machine Learning
- More information put in the introduction.
- April 22, 2009
- Thesis Proposal version 4
- Includes updated sections to include text-to-ontology stage
- Diagrams updated to include the ontology element of the proposal.
- April 23, 2009
- Thesis Proposal version 5
- Includes a section on Sartipi's work: reverse software engineering through data mining
- Includes a tool evaluation method using Gueheneuc's framework.
- April 28, 2009
- Mini-Proposal Presentation PDF
- May 05, 2009
- Thesis Proposal version 6
- Reworked sections 1, 2, and 3.1
- New System Diagram showing Active Learning
- May 06, 2009
- Thesis Proposal version 7
- Formally described filtering and active learning processes
- Added a conclusion
- May 09, 2009
- Thesis Proposal version 8
- Added set diagrams to supplement the section on filtering and active learning
- May 12, 2009
- Thesis Proposal version 9
- May 14, 2009
- Thesis Proposal version 10
- Uses peripheral rather than non-domain
- Includes comments on the accuracy achieved in preliminary experiments
- Includes a paragraph on the contributions of the research
- Includes other corrections from previous draft
- September 1, 2009
- Thesis Proposal version 11
- Includes corrections from Dr. Gannod
- References for the difficulty of creating a training set
- Definition of View
- Ontology tools cited
- Maurice's tool cited
- Table of diagram package classes
- Still need to work on the following
- Thesis statement
- Loop back arrow in diagram?
- Conclusion comment "We will develop tools to support this methodology and perform ..."
Suggestions for Improving Paper for ICPC
| Problem | Correction | Completed |
|---|---|---|
| Section 1 wanders around covering a wide range of topics. | Revise so that it focuses on specific (open) challenges the paper is addressing and the novel techniques applied to resolve this challenge. | no |
| Section 2: The abstract discussion of machine learning makes it difficult to see how it applies to the problem of design recovery. | Make the relationship between machine learning and design recovery clearer. | no |
| Section 2: Related work is presented too early in the paper. Difficult to compare related work when our method has not yet been described. | Put related work after the approach has been presented in detail. "Background" can be integrated with the introduction section to provide context of the paper. | no |
| Section 3: Investigate "learneris" and "domcain" | Fix spelling errors | no |
| Section 3: High level discussion makes it difficult to assess whether the approach is realistic in practice. | Use a case study example to use throughout the paper. | no |
| Section 3: Long and meandering | Tighten up the structure by describing 4-5 key design challenges associated with automated design recovery and then explain how our approach resolves these challenges more effectively than alternatives. | no |
| Section 4: Unclear whether the results from the case study are representative/significant. | Provide a "threats to validity" discussion. | no |
| Section 5: What were the pros and cons of developing and applying our approach? | Provide a "lessons learned" summary. | no |
| Article is too repetitive. | Rework. | no |
| No argument is provided for why classification based on 14 (unspecified) object-oriented metrics is even possible. | List the 14 object-oriented metrics and discuss why they might be good metrics for this classification problem. | no |
| Background material is unnecessary. | Background material should be focused to the concrete methods used in the work described. | no |
| Why use SVMs? What is their role in this work? | Provide an argument or discussion of why SVMs were chosen over other learning algorithms. | no |
| What is a parameter vector theta? | Tighten up discussion on Machine learning. | no |
| Related work only has one reference and it is far from the approach of the paper. | ??? | no |
| Substring matching method is not described. It may lack the consistency needed? | Describe in detail what is happening with the substring matching. Use linguistic methods? | no |
| The phrase "the approach uses active learning to build an optimal training set" is too strong | Use the word "better" instead of "optimal" | no |
| Results are not impressive for Panda and get worse for ArgoUML. | Improve discussion and argument on why the results are acceptable for this task. | no |
| Unclear why results got worse from Panda to ArgoUML. This is important because ArgoUML is a more complex system than Panda and the method should scale. | Provide experimental statistics to help determine the relevance of these results. | no |
| Concept assignment problem introduced at the beginning is not addressed in this paper. | Improve discussion on why the concept assignment problem is used to present a variation of itself. | no |
| Experiments do not show what understandability was gained. | Provide more realistic experiments that include psychological evaluations of understandability gained when using these tools in large, complex systems. | no |
