User:Joe Geyer




Joe Geyer

  • Miami University, Masters in Computer Science, 2008 - present
  • Illinois State University, M.S. Math Education
  • Illinois State University, B.S. Math Education

Reading

  • Asikainen, T., Mannisto, T., & Soininen, T. (2006). A unified conceptual foundation for feature modelling. Proc. 10th International Software Product Line Conference, 31-40. PDF
  • "Recovering Concepts from Source Code with Automated Concept Identification", Maurice M. Carey and Gerald C. Gannod, Proceedings of the 15th IEEE International Conference on Program Comprehension, June 2007 PDF
  • Chen, K., Zhang, W., Zhao, H., & Mei, H. (2005). An approach to constructing feature models based on requirements clustering. Proc. 13th IEEE International Conference on Requirements Engineering, 31-40.
  • Czarnecki, K., Hwan, C., Kim, P., & Kalleberg, K. T. (2006). Feature models are views on ontologies. Proc. 10th International Software Product Line Conference, 41-51. PDF
  • K. Czarnecki, S. Helsen, and U. Eisenecker. Staged Configuration Through Specialization and Multi-Level Configuration of Feature Models. Software Process Improvement and Practice, special issue on "Software Variability: Process and Management", 10(2), 2005, pp. 143-169 PDF
  • Czarnecki, K., & Wasowski, A. (2007). Feature diagrams and logics: There and back again. Proc. 11th International Software Product Line Conference SPLC 2007, 23-34. PDF
  • Czarnecki, K. (2004). FeaturePlugin: Feature modeling plug-in for Eclipse.
  • K. Czarnecki. Overview of Generative Software Development. In J.-P. Banâtre et al. (Eds.): Unconventional Programming Paradigms (UPP) 2004, Mont Saint-Michel, France, LNCS 3566, pp. 313–328, 2005
  • Gannod, G. C., Timm, John T. E., & Brodie, R., J. (2006). Facilitating the specification of semantic web services using model-driven development. International Journal of Web Services Research, 3(3), 61-81. PDF
  • Gannod, G. C., & Timm, John T. E. (2004). An MDA-based approach for facilitating adoption of semantic web service technology. Paper presented at the Proceedings of the 8th IEEE EDOC Enterprise Computing Conference Workshop on Model-Driven Semantic Web, Monterey, California. PDF
  • Poshyvanyk, D., & Marcus, A. (2007). Combining formal concept analysis with information retrieval for. Proc. 15th IEEE International Conference on Program Comprehension, 37-48. PDF
  • Sun, J., Zhang, H., Fang, Y., & Wang, L. H. (2005). Formal semantics and verification for feature modeling. Proc. 10th IEEE International Conference on Engineering of Complex, 303-312. PDF
  • Timm, John T. E., & Gannod, G. C. (2005). A model-driven approach for specifying semantic web services. Paper presented at the Proceedings of the 3rd IEEE International Conference on Web Services (ICWS 2005), Orlando, FL. PDF
  • Ontology Learning Papers
    • Text2Onto.pdf is a paper that briefly describes the text2onto tool. This is the ontology learning tool that I have been using so far. It is a plugin that is being used in the NeON ontology engineering environment.

Interests

Research Information

System Name | Domain | Number of Classes | URL
Alfresco | CMS | 2029 | www.alfresco.com
ANTLR | Lexer and Parser Generator | 1241 | www.antlr.org
Apache Ant | Build Tool | tbd | ant.apache.org
Apache Axis2 | Web Services | tbd | ws.apache.org/axis2
Apache Derby | Database | tbd | db.apache.org/derby
Apache Roller | Weblog Server | tbd | roller.apache.org
Apache Xalan | XSLT Processor | tbd | xalan.apache.org
Apache Xerces | XML Parser | tbd | xerces.apache.org
ArgoUML | UML Tool | 1711 | argouml.tigris.org
Azureus (Vuze) | BitTorrent Client | tbd | www.vuze.com
Checkstyle | Code Analyzer | tbd | checkstyle.sourceforge.net
Eclipse | IDE | tbd | www.eclipse.org/downloads
Jakarta Tomcat | Web Server | tbd | jakarta.apache.org
Jake2 | 3-D Game | tbd | bytonic.de/html/jake2.html
JBoss | JEE Application Server | tbd | www.jboss.org
Edit | Text Editor | tbd | unknown
JHotDraw | Graphics | tbd | www.hotdraw.org
Hibernate | ORM Framework | tbd | www.hibernate.org
IzPack | Installer Generator | tbd | izpack.org
OpenProj | Project Management | tbd | openproj.org
Panda | Proof Assistant | tbd | ASU
Phex | P2P File Sharing | tbd | www.phex.org
PMD | Code Analyzer | tbd | pmd.sourceforge.net
RapidMiner | Data Mining | tbd | rapid-i.com
Scarab | Artifact Tracking System | tbd | scarab.tigris.org
Spark | Jabber Client | tbd | www.igniterealtime.org/projects/spark
Spring | Application Framework | tbd | www.springframework.org
Spring IDE | IDE Extension | tbd | www.springsource.org/springide/release-20

Status Report

  • 1/29 - 2/4
    • Create a demonstration of the feature modeling plugin.
    • Read
      • Sun, J., Zhang, H., Fang, Y., & Wang, L. H. (2005). Formal semantics and verification for feature modeling. Proc. 10th IEEE International Conference on Engineering of Complex, 303-312. PDF
      • Asikainen, T., Mannisto, T., & Soininen, T. (2006). A unified conceptual foundation for feature modelling. Proc. 10th International Software Product Line Conference, 31-40. PDF
      • K. Czarnecki. Overview of Generative Software Development. In J.-P. Banâtre et al. (Eds.): Unconventional Programming Paradigms (UPP) 2004, Mont Saint-Michel, France, LNCS 3566, pp. 313–328, 2005
    • Use StarUML with textbooks to continue learning UML
  • 2/5 - 2/11
    • Create demonstration of Dell computer configuration with feature modeling.
    • Possible papers for presentation.
      • K. Czarnecki, S. Helsen, and U. Eisenecker. Staged Configuration Through Specialization and Multi-Level Configuration of Feature Models. Software Process Improvement and Practice, special issue on "Software Variability: Process and Management", 10(2), 2005, pp. 143-169 PDF
      • Chen, K., Zhang, W., Zhao, H., & Mei, H. (2005). An approach to constructing feature models based on requirements clustering. Proc. 13th IEEE International Conference on Requirements Engineering, 31-40. PDF
      • Czarnecki, K., Hwan, C., Kim, P., & Kalleberg, K. T. (2006). Feature models are views on ontologies. Proc. 10th International Software Product Line Conference, 41-51. PDF
  • 2/12 - 2/18
    • Learn more about product line development with textbooks.
    • Develop more models with the feature model plugin.
  • 2/19 - 2/25
    • Prepare rough draft for presentation of "Staged Configuration Through Specialization and Multi-Level Configuration of Feature Models"
      • K. Czarnecki, S. Helsen, and U. Eisenecker. Staged Configuration Through Specialization and Multi-Level Configuration of Feature Models. Software Process Improvement and Practice, special issue on "Software Variability: Process and Management", 10(2), 2005, pp. 143-169 PDF
    • Develop some examples of the concepts from the article with the feature model plugin.
    • Read about Software Product Line Engineering to supplement knowledge with the article.
  • 2/26 - 3/3
    • Finish presentation for the article "Staged Configuration Through Specialization and Multi-Level Configuration of Feature Models". PDF
    • Develop some examples of the concepts from the article with the feature model plugin.
    • Read about Software Product Line Engineering to supplement knowledge with the article.
    • Find out how to convert Feature Models to UML class diagrams.
    • Begin to learn XML.
  • 3/4 - 3/10
    • Read
      • J. Burge, D.C. Brown, "Software Engineering Using RATionale", Journal of Systems and Software, 81(3): 395-413
      • J. Burge, D.C. Brown, "SEURAT: Integrated Rationale Management", to appear in the Proceedings of the 30th International Conference on Software Engineering (ICSE), Formal Research Demonstrations track, Leipzig, Germany, 10-18 May 2008
    • Investigate Mapping from Feature Models to UML with fmp2rsm plugin.
    • Finish article presentation.
  • 3/25 - 3/31
    • Temporary Course Plan for Graduate School
      • Breadth Courses
        • Advanced Networks - 617
        • Software Engineering - 621
        • Mathematical Modeling - 615 (or Machine Learning - 627?)
        • Introduction to Artificial Intelligence - 586
        • Advanced Database Systems - 585
      • Electives
        • Web Services and SOA - 570
        • Bioinformatics - 570
      • 600 level
        • Network Security - 620 (Possibly instead of Mathematical Modeling)
    • Read
      • Gannod, G. C., Timm, John T. E., & Brodie, R., J. (2006). Facilitating the specification of semantic web services using model-driven development. International Journal of Web Services Research, 3(3), 61-81. PDF
      • Timm, John T. E., & Gannod, G. C. (2005). A model-driven approach for specifying semantic web services. Paper presented at the Proceedings of the 3rd IEEE International Conference on Web Services (ICWS 2005), Orlando, FL. PDF
  • 4/1 - 4/7
    • Read
      • Gannod, G. C., & Timm, John T. E. (2004). An MDA-based approach for facilitating adoption of semantic web service technology. Paper presented at the Proceedings of the 8th IEEE EDOC Enterprise Computing Conference Workshop on Model-Driven Semantic Web, Monterey, California. PDF
    • Gain some experience with Web Services to better understand articles
      • Watched Podcasts from Web Services and SOA 470
  • 4/15 - 4/21
    • Feature model plugin is installed on my lab computer in the graduate office.
    • Key points from last meeting
      • Rationale with feature models as Prescriptive v. Descriptive
      • Feature modeling or product line approach is beneficial to rationale because it gives ways of reasoning about all of the alternatives.
      • Feature modeling offers post-hoc reasoning about why the structure/architecture is the way it is.
      • How do we capture rationale with feature models?
      • How do we recover rationale?
      • How do we evaluate decisions with rationale? Difference in designer opinion


  • 6/23 - 6/29
    • Zip file that contains the feature model plugin and directions for installation. download file
  • 7/25 - 7/31
    • Reading
      • Dietterich, T. G. (2003). Machine Learning. In Nature Encyclopedia of Cognitive Science, London: Macmillan, 2003.
        • This article gives an introductory overview of machine learning, with sections on supervised and unsupervised learning. After presenting important terminology (e.g. classifier, generalization, training set), it gives an example of learning decision trees, within which the errors of overfitting and underfitting are discussed. Several examples are also given for unsupervised learning. The article does not discuss SVMs.
      • Chs. 18, 20 from Russell, S. (2003). Artificial intelligence: A modern approach. New Jersey. Pearson.
        • These chapters introduce some of the concepts of learning in artificial intelligence. The text also includes a short introduction to SVMs.
      • Rapid Miner Tutorial
        • Rapid Miner is a learning environment. I am interested in how a training set can be used to create a learning model using a learning algorithm. I downloaded Rapid Miner on my computer in the lab and worked through the online tutorial. This tutorial gave me exposure to how experiments are set up and how to view the results.
      • I borrowed a linear algebra textbook and reviewed some of the sections on linear transformations, kernels, hyperplanes, etc. These are terms that kept coming up in my reading on SVMs from other sources.
    • Goals
      • Develop a small experiment to run in Rapid Miner that uses Decision Trees and SVMs and compare results.
      • Continue to learn about more types of machine learning techniques (e.g. Naive Bayes).
      • Find and work through an SVM tutorial.
  • 8/7 - 8/14
    • Wrote a RapidMiner Tutorial using three datasets that were taken from the UCI Machine Learning Repository.
    • Reading
      • "Recovering Concepts from Source Code with Automated Concept Identification", Maurice M. Carey and Gerald C. Gannod, Proceedings of the 15th IEEE International Conference on Program Comprehension, June 2007 PDF
  • 8/29 - 9/5
    • Prepared presentations for Software Engineering Reading Group
    • J. Bowring, J. Rehg, and M.J. Harrold, "Active Learning for Automatic Classification of Software Behavior", in proceedings of ISSTA '04, 2004.
  • 9/6 - 9/12
    • Downloaded SVM Classification plugin for Eclipse.
    • Imported software systems into Eclipse for classification.
      • Systems imported into Eclipse
        • RapidMiner
        • Jake2
    • Began classification of concept classes
    • I need more training on how to identify concept vs. non-concept classes.
  • 10/24 - 10/29
    • Developed a Java program to scan a text document
      • Puts the words in a sorted array list.
      • Does not save a word more than once.
      • We will use these words as keywords to match the names of Java classes.
    • Current issues
      • I need to find a way to extract all of the class names from a software system. I can export all the .java files, but I just need a list of the names, not the files.
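
The keyword scan described for 10/24 - 10/29, together with one possible answer to the open issue of listing class names, could be sketched roughly as follows. This is an illustration only, not the program that was actually written: the class name KeywordScanner and both method names are hypothetical, and it assumes that words are runs of ASCII letters compared case-insensitively and that each .java file name corresponds to one top-level class.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch; names and tokenization rules are assumptions.
final class KeywordScanner {

    /** Returns each distinct word of the text once, lower-cased, in sorted order. */
    static List<String> uniqueSortedWords(String text) {
        List<String> words = new ArrayList<>();
        // Assumption: a "word" is a run of ASCII letters.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (token.isEmpty()) {
                continue;
            }
            // Binary search keeps the list sorted and detects duplicates.
            int pos = Collections.binarySearch(words, token);
            if (pos < 0) {
                words.add(-pos - 1, token);
            }
        }
        return words;
    }

    /** Lists class names under a source tree from the .java file names alone. */
    static List<String> classNames(Path sourceRoot) throws IOException {
        try (Stream<Path> paths = Files.walk(sourceRoot)) {
            return paths
                    .filter(Files::isRegularFile)
                    .map(p -> p.getFileName().toString())
                    .filter(name -> name.endsWith(".java"))
                    .map(name -> name.substring(0, name.length() - ".java".length()))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(uniqueSortedWords("The class diagram shows a Class and a Diagram."));
        if (args.length > 0) {
            // Optional: pass the root of a source tree to print its class names.
            classNames(Paths.get(args[0])).forEach(System.out::println);
        }
    }
}
```

Working from file names avoids exporting the files themselves, but it misses nested classes and would also list files such as package-info.java, so the result may need some manual cleanup.
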
  • 11/7 - 11/14
    • Compared class names and keywords with a Java program.
  • 11/14 - 11/21
    • ArgoUML System
      • Implement root word filters
      • Implement vowel filter
      • Informally test the keyword matching
  • 11/21 - 11/28
    • Post Systems to be classified on wiki
    • Export ArgoUML classes to an Excel spreadsheet.
    • Research Meeting Discussion
      • What research has been conducted on concept extraction from a document?
      • Possibly look at Scarab after ArgoUML
      • Need to hand label the ArgoUML Excel Spreadsheet
  • 1/20/2009 - 1/27/2009
    • Initial results from training set classifier
      • Total Classes = 1711
      • RootWordFilter(total classes) = 1176
      • Oracle:
        • Actual Concept Classes = 126
        • Actual Non-concept classes = 1585
      • Keywords:
        • From PDF = 999 (used for positive ID of concept classes)
        • From Java = 97 (used for negative ID of concept classes)
      • Mapping : (uses root word filter on classes)
        • With vowel filter
          • 396 concept classes (5 characters)
          • 86 non-concept classes (3 characters)
          • 355 after subtracting non-concept matches
        • Without vowel filter
          • 684 concept classes (5 characters)
          • 257 non-concept classes (3 characters)
          • 524 after subtracting non-concept matches ({concept} – {non-concept} = {new concept})
        • With vowel and prefix filter (“action”, “association”)
          • 320 concept classes (5 characters)
          • 85 non-concept classes (3 characters)
          • 279 after subtracting non-concept matches
      • Results compared to Oracle:
        • With vowel filter
          • 355 predicted
          • 126 actual
          • 3 matches ???? (clssdgrm, dplymntdgrm, sqncdgrm)
        • Without vowel filter
          • 524 predicted
          • 126 actual
          • 37 matches -> 29% of actual, 7% of predicted
        • With vowel filter and prefix filter (“action”, “association”)
          • 279 predicted
          • 126 actual
          • 3 matches ????
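
The mapping and comparison steps above could be sketched as follows. This is a minimal illustration under stated assumptions, not the program that produced these numbers: the class ConceptMatcher and its methods are hypothetical, the vowel filter is assumed to lower-case a name and delete a, e, i, o, u (which is consistent with clssdgrm, dplymntdgrm and sqncdgrm), "(5 characters)" and "(3 characters)" are read as minimum keyword lengths applied to the compared form, and matching is plain substring containment. The root word and prefix filters are not modeled.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; filter and matching rules are assumptions.
final class ConceptMatcher {

    /** Assumed vowel filter: lower-case the name and drop a, e, i, o, u. */
    static String stripVowels(String name) {
        return name.toLowerCase(Locale.ROOT).replaceAll("[aeiou]", "");
    }

    /** Class names that contain at least one sufficiently long keyword as a substring. */
    static Set<String> matching(Collection<String> classNames, Collection<String> keywords,
                                int minLength, boolean vowelFilter) {
        Set<String> hits = new TreeSet<>();
        for (String cls : classNames) {
            String c = vowelFilter ? stripVowels(cls) : cls.toLowerCase(Locale.ROOT);
            for (String kw : keywords) {
                String k = vowelFilter ? stripVowels(kw) : kw.toLowerCase(Locale.ROOT);
                if (k.length() >= minLength && c.contains(k)) {
                    hits.add(cls);
                    break;
                }
            }
        }
        return hits;
    }

    /** {concept} minus {non-concept} = {new concept}, as in the mapping step. */
    static Set<String> predictConcepts(Collection<String> classNames,
                                       Collection<String> conceptKeywords,
                                       Collection<String> nonConceptKeywords,
                                       boolean vowelFilter) {
        Set<String> predicted = matching(classNames, conceptKeywords, 5, vowelFilter);
        predicted.removeAll(matching(classNames, nonConceptKeywords, 3, vowelFilter));
        return predicted;
    }

    /** Rounded percentage, e.g. matches as a share of actual or of predicted classes. */
    static long percent(int part, int whole) {
        return Math.round(100.0 * part / whole);
    }

    public static void main(String[] args) {
        // Made-up class names and keywords, for illustration only.
        List<String> classes = Arrays.asList("ClassDiagramGraphModel",
                "DeploymentDiagramRenderer", "StringUtil", "DiagramListModel");
        List<String> concept = Arrays.asList("diagram", "deployment");
        List<String> nonConcept = Arrays.asList("list", "util");
        System.out.println(predictConcepts(classes, concept, nonConcept, false));
        System.out.println(predictConcepts(classes, concept, nonConcept, true));
        System.out.println(percent(37, 126) + "% of actual, "
                + percent(37, 524) + "% of predicted");
    }
}
```

With the figures from the run without the vowel filter, percent(37, 126) and percent(37, 524) reproduce the reported 29% of actual and 7% of predicted. Under these assumptions, a keyword such as "diagram" shrinks to four characters once its vowels are removed and falls below the five-character minimum, which is one possible reason the vowel-filtered runs find so few matches.
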

Thesis Proposal

  • March 31, 2009
  • April 6, 2009
    • Thesis Proposal version 2
      • Includes an extra diagram to show the filtering process.
  • April 21, 2009
    • Thesis Proposal version 3
      • More developed section on Machine Learning
      • More information put in the introduction.
  • April 22, 2009
    • Thesis Proposal version 4
      • Sections updated to cover the text-to-ontology stage
      • Diagrams updated to include the ontology element of the proposal.
  • April 23, 2009
    • Thesis Proposal version 5
      • Includes a section on Sartipi's work: reverse software engineering through data mining
      • Includes a tool evaluation method using Gueheneuc's framework.
  • April 28, 2009
    • Mini-Proposal Presentation PDF
  • May 05, 2009
    • Thesis Proposal version 6
      • Reworked sections 1, 2, and 3.1
      • New System Diagram showing Active Learning
  • May 06, 2009
    • Thesis Proposal version 7
      • Formally described filtering and active learning processes
      • Added a conclusion
  • May 09, 2009
    • Thesis Proposal version 8
      • Added set diagrams to supplement the section on filtering and active learning
  • May 12, 2009
  • May 14, 2009
    • Thesis Proposal version 10
      • Uses peripheral rather than non-domain
      • Includes comments on the accuracy achieved in preliminary experiments
      • Includes a paragraph on the contributions of the research
      • Includes other corrections from previous draft
  • September 1, 2009
    • Thesis Proposal version 11
      • Includes corrections from Dr. Gannod
        • References for the difficulty of creating a training set
        • Definition of View
        • Ontology tools cited
        • Maurice's tool cited
        • Table of diagram package classes
      • Still need to work on the following
        • Thesis statement
        • Loop back arrow in diagram?
        • Conclusion comment "We will develop tools to support this methodology and perform ..."

Suggestions for Improving Paper for ICPC

Problem | Correction | Completed
Section 1 wanders around covering a wide range of topics. | Revise so that it focuses on the specific (open) challenges the paper is addressing and the novel techniques applied to resolve them. | no
Section 2: The abstract discussion of machine learning makes it difficult to see how it applies to the problem of design recovery. | Make the relationship between machine learning and design recovery clearer. | no
Section 2: Related work is presented too early in the paper. It is difficult to compare related work when our method has not yet been described. | Put related work after the approach has been presented in detail. "Background" can be integrated with the introduction section to provide context for the paper. | no
Section 3: Investigate "learneris" and "domcain" | Fix spelling errors | no
Section 3: High-level discussion makes it difficult to assess whether the approach is realistic in practice. | Use a case study example throughout the paper. | no
Section 3: Long and meandering | Tighten up the structure by describing 4-5 key design challenges associated with automated design recovery and then explain how our approach resolves these challenges more effectively than alternatives. | no
Section 4: Unclear whether the results from the case study are representative/significant. | Provide a "threats to validity" discussion. | no
Section 5: What were the pros and cons of developing and applying our approach? | Provide a "lessons learned" summary | no
Article is too repetitive. | Rework. | no
No argument is provided for why classification based on 14 (unspecified) object-oriented metrics is even possible. | List the 14 object-oriented metrics and discuss why they might be good metrics for this classification problem. | no
Background material is unnecessary. | Background material should be focused on the concrete methods used in the work described. | no
Why use SVMs? What is their role in this work? | Provide an argument or discussion of why SVMs were chosen over other learning algorithms. | no
What is a parameter vector theta? | Tighten up discussion on machine learning. | no
Related work only has one reference and it is far from the approach of the paper. | ??? | no
Substring matching method is not described. It may lack the consistency needed? | Describe in detail what is happening with the substring matching. Use linguistic methods? | no
The phrase "the approach uses active learning to build an optimal training set" is too strong | Use the word "better" instead of "optimal" | no
Results are not impressive for PANDA and get worse for ArgoUML | Improve discussion and argument on why the results are acceptable for this task. | no
Unclear why results got worse from Panda to ArgoUML. This is important because ArgoUML is a more complex system than Panda and the method should scale. | Provide experimental statistics to help determine the relevance of these results. | no
Concept assignment problem introduced at the beginning is not addressed in this paper. | Improve discussion on why the concept assignment problem is used to present a variation of itself. | no
Experiments do not show what understandability was gained. | Provide more realistic experiments that include psychological evaluations of understandability gained when using these tools in large, complex systems. | no