Google Summer of Code 2018 Work Product Submission



coala

Ishan Srivastava

I am a third-year student of Computer Science and Engineering at the Indian Institute of Technology Dharwad (expected graduation date: May 2020). I participated in GSoC and worked with coala to implement Green Mode in the coala-quickstart repository. Green Mode generates project-dependent configuration files for which coala reports no inconsistencies in the existing code base, which makes them helpful for detecting inconsistencies in future commits. During the GSoC period I also contributed small patches to other coala repositories.


Patches Tarball


SHA-256:

54bc146da5f1f958991087d7457d58e2fa39a539856fb8da7fd65245099d5345

Bonding

Phase 1

Phase 2

Phase 3


Links to commits and repositories I've worked on:

Repository | Link to Commit/s | Description
projects | View | [Bonding Period Work] quickstart_green_mode.md: Add Quickstart Green Mode Project
projects | View | [Bonding Period Work] .gitignore: Add .DS_Store
cEPs | View | README.md: Add green mode cEP
cEPs | View | cEP-0022.md: Add cEP for quickstart green mode
cEPs | View | CODEOWNERS: Add a list of developers
coala-quickstart | View | [Bonding Period Work] EditorconfigParsing.py: Correct function name
coala-quickstart | View | [Bonding Period Work] .coafile: Remove the deprecated [Default] section
coala-quickstart | View | [Bonding Period Work] .coafile: Enable Quotes Bear
coala-quickstart | View | [Bonding Period Work] README.rst: Change instructions for dev version
coala-quickstart | View | Add 3 test bears
coala-quickstart | View | Add 4 test bears
coala-quickstart | View | Add some general purpose functions and tests
coala-quickstart | View | Add SettingsClass and tests
coala-quickstart | View | Add the green_mode tag
coala-quickstart | View | SettingsClass.py: Remove unused function argument
coala-quickstart | View | SettingsClass.py: Parse generate_config() also
coala-quickstart | View | SettingsClass.py: Get optional settings of deps
coala-quickstart | View | FileGlobs.py: Recursively look for gitignore files
coala-quickstart | View | Add GREEN_MODE_COMPATIBLE_BEAR_LIST
coala-quickstart | View | Add bear_settings.yaml
coala-quickstart | View | Add QuickstartBear
coala-quickstart | View | .moban.yaml: Increase pytest timeout
coala-quickstart | View | Add filename_operations.py
coala-quickstart | View | Project.py: Fix bug while printing languages
coala-quickstart | View | bear_settings.yaml: Add green_mode.py
coala-quickstart | View | green_mode.py: Run BEAR_DEPS bears while testing
coala-quickstart | View | Add green mode incompatible bears list
coala-quickstart | View | Project.py: Ask to select languages
coala-quickstart | View | Aggregate green mode per file results
coala-quickstart | View | green_mode_core.py: Fix a bug
coala-quickstart | View | Project.py: Fix bug while printing languages
coala-quickstart | View | bear_settings.yaml: Fix wrong bear for settings
coala-quickstart | View | green_mode.py: Run BEAR_DEPS bears while testing
mobans | View | appveyor.yml.jj2: Force pip 9
mobans | View | coala-setup.py.jj2: Add maintainer_list
coala-utils | View | Setup.py: Edit the maintainer field
coala-utils | View | Add function for sorted glob.glob() output
coala-utils | View | FilePathCompleterTest.py: Fix test


Quickstart Green Mode

Work Done

During the first phase, my main objective was to collect metadata from the various bears so that coala knows what type of values each of their settings accepts. I grouped the settings into types, e.g. those that accept bool values, those that accept int or float values (i.e. an infinite set of values), and those that accept a discrete set of strings, such as the error codes for pycodestyle. I then collected these settings in a dict whose values list all acceptable values for each setting, and generated all combinations of setting values for those bears. I then ran coala repeatedly until I found a set of settings that produces no errors. I also added some bears, such as the QuickstartBear, which helped me guess values for settings that can potentially take an infinite set of values. The last phase involved creating coala configuration files, i.e. .coafiles, out of the collected green (i.e. non-erroneous) data.
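The combination search described above can be sketched roughly as follows. This is a minimal illustration, not the code in coala-quickstart: the setting names, the candidate values and the `fake_bear_is_green` check are hypothetical stand-ins for running a real bear on a file.

```python
from itertools import product


def setting_combinations(candidates):
    """Yield every combination of candidate values as a settings dict."""
    names = sorted(candidates)
    for values in product(*(candidates[name] for name in names)):
        yield dict(zip(names, values))


def find_green_settings(candidates, is_green):
    """Return all combinations for which the check reports no problems."""
    return [combo for combo in setting_combinations(candidates)
            if is_green(combo)]


# Hypothetical candidate values for one bear's discrete settings.
candidates = {
    'use_spaces': [True, False],
    'preferred_quotation': ['"', "'"],
}


def fake_bear_is_green(settings):
    # Stand-in for running a bear: this pretend file is indented with
    # spaces and uses single quotes, so only that combination is green.
    return settings['use_spaces'] and settings['preferred_quotation'] == "'"


green = find_green_settings(candidates, fake_bear_is_green)
print(green)
```

The number of combinations grows multiplicatively with each setting, which is one reason settings with an unbounded value range need to be guessed separately rather than enumerated.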

Challenges

Separating the bear settings was a challenge in itself, as the approach was not scalable because many bears lack type annotations or default values. It was also impossible to tell just from the type annotations whether a setting takes an infinite or a discrete set of values. For example, tab spaces don’t take infinite values, but the maximum line length for files in a particular project can. There were discussions about how type annotations in bears could be used to specify the default values, which would make discrete settings easy to identify.
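To illustrate why annotations alone are not enough, here is a small sketch that classifies the parameters of a function by their type annotation using the standard `inspect` module. The `example_run` function and its parameters are hypothetical and only mimic the shape of a bear's settings.

```python
import inspect


def classify_settings(func):
    """Group a function's parameters by what their annotation suggests."""
    groups = {'discrete': [], 'open_ended': [], 'unknown': []}
    for name, param in inspect.signature(func).parameters.items():
        if param.annotation is bool:
            # Only two possible values, so safe to enumerate.
            groups['discrete'].append(name)
        elif param.annotation in (int, float):
            # Looks unbounded, even if only a few values make sense.
            groups['open_ended'].append(name)
        else:
            # Missing or unrecognised annotation: nothing can be inferred.
            groups['unknown'].append(name)
    return groups


def example_run(use_spaces: bool,
                max_line_length: int = 79,
                tab_width: int = 4,
                indent_style=None):
    """Hypothetical settings signature, used only for this illustration."""


groups = classify_settings(example_run)
print(groups)
```

Both `max_line_length` and `tab_width` end up in the same group, although only the first one is genuinely open-ended in practice, and the unannotated parameter cannot be classified at all.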

The next challenge was launching the bears in a multiprocessing environment and reducing the run time of coala-quickstart’s --green-mode as much as possible. I worked with generators alongside multiprocessing and learned that the two don’t go well together.
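The report does not say exactly what went wrong, but one well-known source of friction is that Python's multiprocessing pickles the data it sends between processes, and generator objects cannot be pickled. The sketch below shows that limitation with `pickle` alone, along with the usual workaround of materializing the generator first. The `file_results` helper is hypothetical.

```python
import pickle


def file_results(names):
    """Lazily yield a (name, status) pair for each file name."""
    for name in names:
        yield name, 'green'


lazy = file_results(['a.py', 'b.py'])

try:
    pickle.dumps(lazy)
    picklable = True
except TypeError:
    # Generator objects cannot be pickled, so they cannot be sent to or
    # returned from a worker process as they are.
    picklable = False

# Materializing the generator gives a plain list, which pickles fine.
materialized = list(file_results(['a.py', 'b.py']))
restored = pickle.loads(pickle.dumps(materialized))
print(picklable, restored)
```

Turning results into lists costs memory, but it keeps everything that crosses a process boundary picklable.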

The last challenge was creating configuration files out of the green results and aggregating the various settings into config file sections. Tree and trie data structures were used to aggregate the per-file results into globs and sections, as the config files needed to be as specific as possible.
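A rough idea of how a trie over path components can turn per-file results into globs is sketched below. This is an assumption-laden simplification, not the aggregation implemented in coala-quickstart: the function names are hypothetical, paths use `/` as separator, and a directory is collapsed into `dir/**` only when every file beneath it is green.

```python
def build_trie(results):
    """Build a nested-dict trie from {path: is_green}, keyed by path part."""
    root = {}
    for path, is_green in results.items():
        node = root
        parts = path.split('/')
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = is_green
    return root


def collapse(node, prefix=''):
    """Return (all_green, globs) for the subtree rooted at node."""
    globs = []
    all_green = True
    for name, child in sorted(node.items()):
        path = prefix + name
        if isinstance(child, dict):
            child_green, child_globs = collapse(child, path + '/')
            if child_green:
                # Every file below this directory is green: one glob.
                globs.append(path + '/**')
            else:
                globs.extend(child_globs)
                all_green = False
        elif child:
            globs.append(path)
        else:
            all_green = False
    return all_green, globs


# Hypothetical per-file results for one bear and one set of settings.
results = {
    'src/a.py': True,
    'src/b.py': True,
    'tests/t1.py': True,
    'tests/t2.py': False,
    'setup.py': True,
}
all_green, globs = collapse(build_trie(results))
print(globs)
```

Here `src` collapses into a single glob, while `tests` stays listed file by file because one of its files is not green.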

Work to be done

coala-quickstart could offer a --lite-mode to decrease its run time, in which it selects only a subset of the files in the directory and runs checks only on that smaller number of files.
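How the subset would be chosen is left open. One simple possibility, shown purely as an assumption, is seeded random sampling with an upper limit on the number of files. The `sample_files` helper and the file names are hypothetical.

```python
import random


def sample_files(files, limit, seed=None):
    """Return at most `limit` files, chosen at random and sorted by name."""
    ordered = sorted(files)
    if len(ordered) <= limit:
        return ordered
    # A seeded generator keeps the selection reproducible between runs.
    return sorted(random.Random(seed).sample(ordered, limit))


all_files = ['src/module_{}.py'.format(i) for i in range(20)]
subset = sample_files(all_files, limit=5, seed=0)
print(subset)
```

A real implementation would probably want to sample per language or per directory so that small but distinct parts of a project are not skipped entirely.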

Various files are ignored. The user could be asked whether they would like to fix the inconsistencies in those files using coala. All the ignored files could then be placed in the files field, together with the section settings for them, so that coala can be run with just a single command: coala.

Tests should be improved by devising a way to actually clone a repository and run --green-mode on it, in order to increase the project’s test coverage.

The config files can be produced in many ways. There is always a trade-off between the length of the files field, the length of the ignores field, and the number of sections, which can become very large. Too much of any of these can overwhelm the user. It still needs to be discussed whether different sub-modes of --green-mode should be introduced, or whether some sweet middle ground between the three factors can be found.

https://bitbucket.org/snippets/ishanSrt/qedaoo