Data Science Coding Efficiency Checklist

In the last two years, I’ve learned a lot about how to code efficiently.

I think learning the basics of efficient development is important, but two years ago I didn’t know what that meant. So I put what I now consider the “basics” into a checklist. It borrows heavily from how people work in software development, and I use most of these items daily in my workflow.

The checklist

Git

  • I know how to ignore data and files in my project.
  • I know how to create a new branch when developing a new feature.
  • I know how to merge branches and resolve merge conflicts.
  • I know how to discard or fix bad commits.
  • I know how to squash commits to prevent clutter in the history.

Projects

Tooling

  • I know how to use my IDE or text editor efficiently for my project:
    • I know how to look up a function or a class.
    • I know how to rename a method or a class across all occurrences.
    • I know how to move a function or class to a different file.
    • I know how to use a linter with style guides.

Distribution

  • I know how to pin project dependencies for development.
  • I know how to create the development environment with dependencies for a project.
  • I know how to package my code for distribution.
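
To make "pinning" concrete, here is a minimal sketch that lists every package installed in the current environment as an exact `name==version` line, similar in spirit to what `pip freeze` prints. The helper name `pinned_requirements` is my own for illustration, and the sketch assumes Python 3.8+ so that `importlib.metadata` is in the standard library.

```python
from importlib.metadata import distributions


def pinned_requirements():
    """Return sorted 'name==version' lines for installed packages.

    Hypothetical helper for illustration; it only reads installed metadata.
    """
    pins = set()
    for dist in distributions():
        name = dist.metadata["Name"]
        version = dist.version
        # Skip distributions whose metadata is missing or broken.
        if name and version:
            pins.add(f"{name}=={version}")
    return sorted(pins, key=str.lower)


lines = pinned_requirements()
print("\n".join(lines))
```

Saving that output to a file and committing it is one way to let a teammate rebuild the same environment later. In practice a dedicated tool will usually do this better than a hand-rolled script.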

Testing

  • I know how to run tests for existing projects.
  • I know how to use a testing framework.
  • I know how to write a test to complete a bugfix and prevent the bug from happening again in the future.
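
As a sketch of the last item, suppose a scaling function used to crash on a constant column. The function `min_max_scale` and the bug itself are made up for illustration, and I am assuming pytest as the test runner, although the tests are plain functions with `assert` and need no imports.

```python
def min_max_scale(values):
    """Scale values to the range [0, 1]. Hypothetical example function."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # The fix: constant input used to raise ZeroDivisionError here.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def test_scales_to_unit_interval():
    # Existing behavior that must keep working after the fix.
    assert min_max_scale([0, 5, 10]) == [0.0, 0.5, 1.0]


def test_constant_input_regression():
    # Regression test: reproduces the input that triggered the bug.
    assert min_max_scale([3, 3, 3]) == [0.0, 0.0, 0.0]
```

Writing the second test before the fix, watching it fail, and then making it pass is what keeps the same bug from quietly coming back.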

Misc

  • I know how to use a profiler to look for bottlenecks.
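
For Python, one option is the standard library's `cProfile`. The sketch below profiles a single call and returns a short report sorted by cumulative time. The names `slow_sum_of_squares` and `profile_call` are hypothetical, chosen only for this example.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Deliberately naive loop so there is something to measure."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args):
    """Run func(*args) under cProfile; return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Show the five entries with the highest cumulative time.
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


result, report = profile_call(slow_sum_of_squares, 100_000)
print(report)
```

The functions at the top of the report are the ones worth optimizing first. Measuring before changing anything avoids spending effort on code that was never the bottleneck.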

What has helped you to become more efficient as a coder? Please share in the comments or tweet at @ChangeLeeTW and let me know.

Written on October 12, 2019