Jan 01, 2020
 

Today I’ve finished version 0.5 of my new Gem, Universal Track Manager.

It’s a plug-and-play Rails engine that you install into your Ruby on Rails application with just three simple steps (see the README). You can then immediately pick up your visitors’:

IP address
Ad campaign where they came from
the browser they are using

In my next version, I’ll add support for http referrer and more too. Give it a try today.

If you like the Gem, please ‘star’ it on Github or download it from RubyGems (you do that just by adding it to your Gemfile and running bundle install). Also, consider supporting it with a small contribution through the Github Sponsors program. Sponsor levels start at just $1/month.

Dec 12, 2019
 

Today I’m announcing ‘a first look’ at my new Gem: Universal Track Manager. It’s an ambitious project that’s going to have nearly universal appeal and utility.

Visitors come to your site every day. Along with their visits, 4 key pieces of information come along for the ride:

— IP address
— browser name (which lets you infer operating system)
— UTMs showing if they clicked from another site, or if they came from online advertising (typically you can “auto-tag” your ad campaigns and your UTMs will be magically populated)
— Http referrer, which shows if they clicked directly from another site to your site (Even when no UTMs are set)

Universal Track Manager, a play-on-words that shares an acronym with “UTM Parameters,” is your one-stop shop to automatically pick up this information and stash it into your database. You can think of it like a built-in Google Analytics (without the fancy dashboard).
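
As a rough illustration of what picking up UTMs involves, here is a minimal plain-Ruby sketch that pulls the standard UTM parameters out of a landing URL. It is not the gem’s code or API; the extract_utms helper and the example URL are hypothetical names made up for this sketch.

```ruby
require "uri"

# The five standard UTM parameter names.
UTM_KEYS = %w[utm_source utm_medium utm_campaign utm_term utm_content].freeze

# Hypothetical helper (not part of Universal Track Manager):
# returns a hash of the UTM parameters present in the URL's query string.
def extract_utms(url)
  query = URI.parse(url).query.to_s
  URI.decode_www_form(query).each_with_object({}) do |(key, value), found|
    found[key] = value if UTM_KEYS.include?(key) && !value.empty?
  end
end

# Example URL is made up; "ref" is not a UTM parameter and is ignored.
landing_url = "https://example.com/pricing?utm_source=newsletter&utm_medium=email&ref=footer"
utms = extract_utms(landing_url)
puts utms.inspect
```

A tracker would store a hash like this alongside the visit record; a URL with no query string simply yields an empty hash.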

As if that weren’t ambitious enough, with a tiny bit of trickery I’m planning support for optional Viewport size (width X height), which can let you determine if the user is on a desktop or mobile browser. (coming soon)

I’m pleased to announce Version 0.0.3, the first version I’m dubbing as ‘public beta.’ Although this is production-quality code, it should be used with caution until it is no longer in BETA status. You are welcome to give it a whirl on your Rails projects today. With an easy 3-step installation into any Rails 4+ app, you can sit back and sweep up tracking info on your visitors.

*MOST* of the core functionality now works! Version 0.0.3 fully implements support for timestamping your visits, the user’s IP address, and browser. Support for UTMs, HTTP referrer, and more is coming soon! If you are curious, now’s a great time to try it out. Please submit feedback via Github.

Links:

Github Repo

Rubygems page

Aug 12, 2019
 

1. deivid-rodriguez/byebug
Byebug is a fantastic debugger available for Ruby 2 (and presumably above). Drop

gem 'byebug'

into your Rails app Gemfile and bundle install. In either your test run or your development run, write

byebug

on a single line of your app and voila. When you hit that line, you will drop into the debugger.

If you’re not developing a Rails app, you can put require 'byebug' at the top of your Ruby file.

Full docs here.

2. pretty print (pp)

Pretty print is one of my favorite introspection weapons to help see variables more clearly.

Pretty print, which you write as pp, prints out your object with each attribute on its own line. Take for example this Spree::Country object, shown here on the Rails console without pretty print:

2.4.6 :010 > x
=> #<Spree::Country id: 232, iso_name: "UNITED STATES", iso: "US", iso3: "USA", name: "United States", numcode: 840, states_required: true, updated_at: "2019-05-19 17:16:07", zipcode_required: true>

Now, with pretty print, the same object is conveniently displayed with each attribute as its own line. This is invaluably helpful when you have deep nesting of objects.

2.4.6 :009 > pp x
#<Spree::Country:0x00007fd8507ea358
id: 232,
iso_name: "UNITED STATES",
iso: "US",
iso3: "USA",
name: "United States",
numcode: 840,
states_required: true,
updated_at: Sun, 19 May 2019 17:16:07 UTC +00:00,
zipcode_required: true>
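
The same effect is easy to try outside of Rails. The sketch below uses a made-up nested hash (the order data is purely illustrative) to compare inspect with pp, and shows pretty_inspect, which returns the pretty-printed text as a String instead of printing it.

```ruby
require "pp"

# Illustrative data only; any deeply nested object works the same way.
order = {
  id: 232,
  customer: { first_name: "Ada", last_name: "Lovelace", email: "ada@example.com" },
  line_items: [
    { sku: "BOOK-001", quantity: 2, price_cents: 1599 },
    { sku: "PEN-042", quantity: 10, price_cents: 249 }
  ]
}

puts order.inspect   # everything on a single long line
pp order             # wrapped across several lines once it exceeds the default width

# pretty_inspect returns the same text as a String, which is handy for log files.
formatted = order.pretty_inspect
```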

3. puts, .to_s, and inspect

OK, so we get a 3-in-1 here: When you call puts on an object, .to_s will be called and then output to your screen. So you should make your objects have a .to_s that is human readable, possibly even for use in, say, a drop-down menu or label. 

class Person
 attr_accessor :first_name, :last_name

 def to_s
   "#{first_name} #{last_name}"
 end
end

inspect, on the other hand, is specifically for developers. In this method, you would write out as much information as you the developer (or next developer) want to see, including the keys (ids) of your objects if those will be helpful:

class Person
 attr_accessor :id, :first_name, :last_name

 def inspect
   "Person id: #{id} - first: #{first_name}; last: #{last_name}"
 end
end

Your objects should have both .to_s and .inspect on them, and you can try these universally named Ruby methods on other people’s objects to examine them. A well-formed codebase implements them or has helpful output for both of these.
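
Putting the two together, here is a small self-contained sketch showing which Ruby calls reach which method. The keyword-argument initializer and the sample values are additions made for the sake of a runnable example.

```ruby
class Person
  attr_accessor :id, :first_name, :last_name

  def initialize(id:, first_name:, last_name:)
    @id = id
    @first_name = first_name
    @last_name = last_name
  end

  # Human-readable form, suitable for labels and drop-downs.
  def to_s
    "#{first_name} #{last_name}"
  end

  # Developer-facing form, including the id.
  def inspect
    "#<Person id: #{id} first: #{first_name}; last: #{last_name}>"
  end
end

person = Person.new(id: 7, first_name: "Grace", last_name: "Hopper")

puts person              # puts calls to_s
puts "Hello, #{person}"  # string interpolation calls to_s too
p person                 # p calls inspect
```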

4. .to_yaml
Pretty print’s cousin is the .to_yaml method, which will take your object and convert it into yaml. Take for example this arbitrary object, which you will notice contains a :ghi key that has a nested object as its value:

2.4.6 :023 > x= {abc: 1, def: 4, ghi: {ye: 6, nm: 3}}
=> {:abc=>1, :def=>4, :ghi=>{:ye=>6, :nm=>3}}
2.4.6 :024 > x
=> {:abc=>1, :def=>4, :ghi=>{:ye=>6, :nm=>3}}

.to_yaml on its own will produce a string that will output with newline characters, like so:

2.4.6 :025 > x.to_yaml
=> "---\n:abc: 1\n:def: 4\n:ghi:\n  :ye: 6\n  :nm: 3\n"

To make this more useful, try puts along with .to_yaml

2.4.6 :026 > puts x.to_yaml
---
:abc: 1
:def: 4
:ghi:
  :ye: 6
  :nm: 3

5. x.method(:_____).source_location

(where :_____ is name of the method — as a symbol — you are trying to search for)

OK, so the ultimate secret weapon of Ruby debugging is this little-known method that will magically — and I mean magically — tell you where a method was defined. That’s right, I mean the actual line number itself.

2.2.5 :002 > a.method(:hello)
=> #<Method: Apple(Thingy)#hello>
2.2.5 :003 > a.method(:hello).source_location
=> ["/Users/jason/Projects/nokogiri-playground/app/models/thingy.rb", 2]

Look, ma, take a peek into my hard drive and you would find that the hello method is actually defined on the file at the full path /Users/jason/Projects/nokogiri-playground/app/models/thingy.rb on line 2.

Like magic it works for Rails and Gem code too, and is invaluable when you are ready to dive into the APIs you are working with.
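
Here is a self-contained version you can paste into a file and run. The class names mirror the console session above but are otherwise arbitrary. It also shows owner, which names the class or module that actually defines the method. Note that methods implemented in C (which covers much of Ruby’s core) have no Ruby source file, so source_location returns nil for them.

```ruby
class Thingy
  def hello
    "hello"
  end
end

class Apple < Thingy
end

a = Apple.new
meth = a.method(:hello)

# source_location returns a two-element array: [file path, line number].
file, line = meth.source_location
puts "hello is defined in #{file} on line #{line}"

# owner is the class or module where the method is really defined,
# which is useful when it is inherited or mixed in.
puts meth.owner
```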

6. x.methods
By default this method will return a list of all of the methods on an object. Watch out because you’ll get all the methods on the ancestor chain too.

In older versions of Ruby, you could use this method to examine the instance methods that were defined on this class only (excluding the ancestors), but unfortunately this no longer works.

If you pass this method false, like so:

x.methods(false)

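…things get more interesting: you get only the singleton methods defined directly on x. When x is a class, those are its class methods. (Remember, in Ruby class methods are defined with self.) To list the instance methods a class defines itself, without its ancestors, call x.class.instance_methods(false).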
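
A short sketch of the different calls side by side; Greeter is a made-up class used only for illustration.

```ruby
class Greeter
  def self.build
    new
  end

  def greet
    "hi"
  end
end

greeter = Greeter.new

class_methods    = Greeter.methods(false)           # singleton methods of the class itself
own_instance     = Greeter.instance_methods(false)  # instance methods defined on Greeter only
singleton_on_obj = greeter.methods(false)           # empty unless methods were defined on this one object
everything       = greeter.methods                  # also includes what Object and Kernel provide

puts class_methods.inspect
puts own_instance.inspect
puts singleton_on_obj.inspect
puts everything.size
```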

7. brunofacca/active-record-query-trace

An excellent gem that’s still a non-optional workhorse in my development practice, especially when debugging a legacy codebase. This gem shows, for ALL of your SQL queries, where in your Ruby or Gem code the Active Record calls are coming from.

Follow the instructions in the Gem to create an initializer file and set it up. My only tip here that adds to the docs is that you’ll want to set the number of lines:

ActiveRecordQueryTrace.lines = 10

I find that when debugging a problem in my own Rails app I want this set to a lower number (like 5 or 10) and when debugging a problem in Gem code or in Rails I need this at a much higher number (like 50 or 100).

8. flyerhzm/bullet

Understanding N+1 queries is a significant litmus test that separates the amateurs from the professionals. Bullet, as the name suggests, is like a magic bullet for finding your N+1 queries.

Bullet is a great gem that you should install in development or test, not in production. Because it does add overhead, I often install it but leave it turned off by default, so that any developer on the team who needs it can turn it on.

You CAN and SHOULD turn it on periodically too, to examine where your app is producing N+1 queries.

Here’s the catch with Bullet: Remember that Active Relation objects are created as chains of conditions before they get translated and executed as SQL. That’s why you must carefully consider what happens when you write a line like this:

query = Country.where(name: "United States")

When you do this in your Rails console, you will see it run the SQL right away.

2.4.6 :039 > Spree::Country.where(name: "United States")
 Spree::Country Load (0.7ms) SELECT `spree_countries`.* FROM `spree_countries` WHERE `spree_countries`.`name` = 'United States' LIMIT 11

It only does this because of the ‘print’ effect the Rails console has on your objects. If you string another .where onto this object, when translated into SQL, ActiveRecord will combine the queries:

query = Spree::Country.where(name: "United States"); query.where(iso: "US");
 Spree::Country Load (1.0ms) SELECT `spree_countries`.* FROM `spree_countries` WHERE `spree_countries`.`name` = 'United States' AND `spree_countries`.`iso` = 'US' LIMIT 11

The reason this is important is that to understand where your N+1 queries are coming from you need to understand when you are creating your Active Relation objects and when they are invoked. They are not the same place, although the Rails console makes you think they are one and the same. When you grok this, you will see why Active Record’s eager loading (which loads a related set of objects in a single optimized query, taking the number of queries from N+1 to 1+1=2) is both efficient and can be tricky to work with, especially with objects that have many relationships.
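
To make the difference between creating and invoking concrete, here is a plain-Ruby sketch. LazyQuery is a hypothetical stand-in written for this post, not ActiveRecord code: it collects conditions, runs nothing until the results are needed, and defines inspect so that a console would trigger it, just as the Rails console does.

```ruby
# Hypothetical stand-in for an Active Relation: it collects conditions
# and only touches the data when the results are actually needed.
class LazyQuery
  attr_reader :log

  def initialize(rows, conditions = {}, log = [])
    @rows = rows
    @conditions = conditions
    @log = log   # shared list of the "queries" that really ran
  end

  # Building: returns a new query object and runs nothing.
  def where(extra)
    LazyQuery.new(@rows, @conditions.merge(extra), @log)
  end

  # Invoking: this is the moment the "SQL" would run.
  def to_a
    @log << @conditions
    @rows.select { |row| @conditions.all? { |key, value| row[key] == value } }
  end

  # A console prints objects via inspect, which is why it appears to run immediately.
  def inspect
    to_a.inspect
  end
end

rows = [
  { name: "United States", iso: "US" },
  { name: "Canada", iso: "CA" }
]

query = LazyQuery.new(rows).where(name: "United States").where(iso: "US")
runs_after_building = query.log.size   # nothing has run yet

results = query.to_a                   # the combined conditions run once, here
runs_after_invoking = query.log.size

puts "ran #{runs_after_building} times while building, #{runs_after_invoking} after invoking"
```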

Don’t be fooled: Eager loading is nearly always more efficient than N+1 queries.

Bullet tells you where those pesky N+1 queries are invoked, but not where you are creating them. What you then need to do is trace your code (manually) to figure out where the queries are being created, which hopefully should be near in the code to where they are being invoked (but in the case of complex filtering logic might not be).

Here you need to add the appropriate .includes(:____) to your code anywhere between when the objects are set up and when they are invoked by Active Record. (A plain .joins(:____) only adds the table to the SQL; it does not load the associated records, so the N+1 remains.) Note that you’ll only want to eager load those additional associations if they are actually used. If not, you don’t need the overhead.

You’ll know you’ve solved your N+1 queries because you won’t see them output in your Rails log, and they’ll disappear from the Bullet output.
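
The N+1 versus 1+1 arithmetic can be seen without a database at all. The following is a plain-Ruby simulation, not ActiveRecord: FakeDB and its data are hypothetical, and all it does is count how many times it is asked for records.

```ruby
# FakeDB is a hypothetical stand-in for a database; it only counts "queries".
class FakeDB
  attr_reader :query_count

  COUNTRIES = [
    { id: 1, name: "United States" },
    { id: 2, name: "Canada" },
    { id: 3, name: "Mexico" }
  ].freeze

  STATES = [
    { country_id: 1, name: "Ohio" },
    { country_id: 2, name: "Quebec" },
    { country_id: 3, name: "Jalisco" }
  ].freeze

  def initialize
    @query_count = 0
  end

  def all_countries
    @query_count += 1
    COUNTRIES
  end

  def states_for(country_ids)
    @query_count += 1
    STATES.select { |state| country_ids.include?(state[:country_id]) }
  end
end

# N+1: one query for the countries, then one more per country.
db = FakeDB.new
db.all_countries.each { |country| db.states_for([country[:id]]) }
n_plus_one_queries = db.query_count

# Batched: one query for the countries, one for all of their states.
db = FakeDB.new
countries = db.all_countries
states_by_country = db.states_for(countries.map { |c| c[:id] })
                      .group_by { |state| state[:country_id] }
batched_queries = db.query_count

puts "N+1: #{n_plus_one_queries} queries, batched: #{batched_queries} queries"
```

With three countries the first approach issues four queries and the second issues two; the gap grows with every extra parent record.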

9. better_errors

Since Rails 4 adopted a near identical default, this used to be more interesting. For a Rails 3 app it can bring your error crash console up to Rails 4 standards.

10. Introspect, Introspect, Introspect but remember Ruby’s last-line quirk

Always look at what you’re doing. Drop into your debugger, look at your variables, clone & freeze them, look for race conditions, look for flip-floppers, watch out for your own confirmation bias. Remember that when in the debugger or on the Rails console and you type a SINGLE VARIABLE and HIT RETURN, the console will interpret that action as-if you had called .inspect

2.2.5 :007 > a
=> #<Apple id: nil, created_at: nil, updated_at: nil>
2.2.5 :008 > a.inspect
=> "#<Apple id: nil, created_at: nil, updated_at: nil>"

In some cases, the act of inspection actually changes the object (like in the case of an Active Relation, in which case it invokes the query), so keep that in mind (we might call this the “observer effect” in software development.)

Feb 10, 2018
 

1. Create an ErrorsController in app/controllers

class ErrorsController < ApplicationController
 def not_found
  respond_to do |format|
   format.html { render template: "errors/not_found",
              layout: "layouts/application",
              status: 404 }
  end
 end

 def server_error
  respond_to do |format|
   format.html { render template: "errors/server_error",
              layout: "layouts/application",
              status: 500 }
  end
 end
end

2. Add these to your routes.rb file

match "/404", :to => "errors#not_found", :via => :all
match "/500", :to => "errors#server_error", :via => :all

3. Add this to your application.rb file

config.exceptions_app = self.routes

4. Delete public/404.html, public/422.html, and public/500.html

5. Remember while developing you should change this to false in config/environments/development.rb

config.consider_all_requests_local = false

If you fail to perform this step, Rails will show you full stacktraces instead of your error page.

Sample app can be found here

Sep 03, 2017
 

Today I’ll take a moment to expound on how web development has changed over the last two decades. Long ago, when we started back in the 90s, connections were slow and web pages didn’t change much.

Built into the design of the internet itself is something you should be familiar with if you are reading this post: browser caching.

Jul 17, 2017
 

Sometimes in the life of a hybrid Rails-Javascript app you may want to do something unique: have a config file written in YAML available to you in your Javascript code.

A simple trick will make Sprockets, the Asset Pipeline in Rails 4+, do this automagically for you. This example comes from https://github.com/rails/sprockets/issues/366

I’ve created an example app; you can read the source or see a live demo here.

First, we’ll need to create a special hook for Sprockets called “depend on config”. Create a file at lib/process_depend_on_config.rb

Sprockets::DirectiveProcessor.class_eval do
 def process_depend_on_config_directive(file)
  path = File.expand_path(file, "#{Rails.root}/config")
  context.depend_on(path)
 end
end

Now, in your Sprockets-managed Javascript, use this directive before you include Ruby evaluation inside of Javascript:

assets/javascripts/pull_yaml_into_json.js.erb

//= depend_on_config 'my_configs_in.yml'

ExampleApp = {};

ExampleApp.MyConfigs = {
 getConfigs: function() {
  var mySettings = <%= YAML.load_file("config/my_configs_in.yml").to_json %>;
  return mySettings;
 }
};

Finally, for demonstration purposes, create a file at config/my_configs_in.yml that holds the YAML configuration you want to port from Ruby to JSON, something like:

hello:
 world: 12345
 country: 678
 state: 90
 city: 11

The depend_on_config directive tells Sprockets to invalidate the cache for the resulting JS file when the YAML file changes, which is why it is needed here.

Voila! Your Ruby-based output (here, a YAML config but theoretically could be anything) is now included in each build during the Sprockets compilation phase.
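
If you want to see what the ERB step produces without booting Rails or Sprockets, the conversion can be reproduced with the standard library alone. This is only a sketch of the YAML-to-JSON step: it inlines the YAML as a string rather than reading config/my_configs_in.yml, and it leaves the Sprockets directive out entirely.

```ruby
require "yaml"
require "json"
require "erb"

# Same content as the example YAML file above, inlined for the sketch.
yaml_text = <<~YAML
  hello:
    world: 12345
    country: 678
    state: 90
    city: 11
YAML

configs = YAML.safe_load(yaml_text)

# ERB evaluates the Ruby between <%= %> and drops the JSON text into the Javascript.
template = ERB.new("var mySettings = <%= configs.to_json %>;")
javascript = template.result(binding)
puts javascript
```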

Jun 05, 2016
 

If you’re a Ruby or Rails developer looking for some advice on how to get better at integration testing: congratulations! You’ve reached the highest level of difficulty in all of the areas of the stack you must conquer to become a great Ruby developer.

Integration testing is hard, but it doesn’t have to be. The subtlety of this truth lies in the fact that you must be skilled in both the backend and front-end of your app: you must understand factories and your Ruby objects, and if you have a Javascript-heavy app, the deep fundamentals of Javascript as well.

First things first, you will want to learn how to debug in both the front and back-ends. For the sake of this post, I’m going to assume you have learned a backend debugging tool like byebug. If not, try this tutorial now.

Second, you need to know that Capybara is a syntax for writing Ruby, “a DSL”, for telling a browser what to do. It can work against several different browsers: Firefox, Chrome, and ‘headless’ browsers you can’t see. If you use a browser you can see, you get the neat effect of being able to view your results as they run, which can be fun (and you should do it) but may not work on a Continuous Testing / Continuous Integration platform.

A headless browser (of which I will discuss two: webkit and poltergeist) is complex to debug, and requires a command of all the parts of the stack.

Occasionally, you may write some Javascript code that will work in one browser and not another (you should learn to avoid this) – that’s why you can run Capybara with a single syntax against many different browsers.

The bad news is, in short, despite it being 2016 and Rails having been around for nearly 12 years, none of the drivers is perfect.

Some time ago I wrote about a neat little trick to view console messages while debugging Capybara webkit.

Driver’s name: Selenium
Browser: Firefox
The Bad: Firefox doesn’t let you paste into the console.

Driver’s name: Chrome
Browser: Chrome
The Bad: The Chrome debugging experience has some annoying gotchas. Don’t try to open the debugger while your spec is running, unless you pause on the back-end (for example, byebug). If you do pause on the Rails-side, you should be able to also fall into the debugger on the Chrome driver side too.
The Good: If you do actually manage to open Developer Tools, you can reasonably debug your Javascript.

Driver’s name: webkit
Browser: headless
The Bad: There are problems with PATCH requests when using this legacy headless driver. Take note that this PATCH problem was fixed in PhantomJS version 2.0. Webkit also requires you install QT on your system.
The Good: Webkit lets you inspect status codes using driver.status_code and, as mentioned in the post above, console messages too.

Driver’s name: poltergeist
Browser: headless
The Bad: If your app makes PATCH requests, note that poltergeist needs you to be running on Phantom JS 2.0 or higher to be able to process PATCH requests correctly (on older versions, they come through on the server side as empty requests). By default, anything that is sent from your Javascript as a console message makes your spec run fail (this can be turned off).
The Good: “The Worst Except For All the Others” (as Churchill said). This is the one I use primarily. Your console.log output is automatically ported from your Javascript into your test results.

Here’s a list of other notes of things to keep in mind.

  1. You should be using Capybara version 2.7.1 or higher. Earlier versions do not wait for all sessions to close before kicking off Database cleaner’s truncation. When truncation happens before all sessions are closed, bad things happen (like intermittent failing tests). Waiting and timing is explained in detail below.
  2. This applies to you if your app makes PATCH requests: Make sure you are on Phantom JS 2.0 or higher. Note this is a binary to install, and on a CI server it is probably a global (shell) configuration. (On ours, Semaphore, you need to specify the Phantom JS version in the global build commands, not just in your Gemfile.) You need to be running on Phantom JS 2.0 or higher to be able to process PATCH requests correctly. On older versions, they come through on the server side as empty requests, which can lead to unexpected results.
  3. Capybara-webkit sucks. It just does. Don’t use it. The intermittent issues alone are enough to throw it out. It is an older technology and by and large it has been replaced by Poltergeist, so use Poltergeist instead. Experienced developers know this and don’t use webkit for this reason. Junior developers fight in vain trying to get webkit to work and waste lots of time believing in something that simply is a shitty piece of technology.
  4. When working with ChromeDriver note that it is annoyingly difficult to open the Developer Tools while the test is running. This is a known issue, and the Chrome developers advise you to pause your test to open Chrome Dev Tools. This is explained here.
  5. When using Database cleaner with truncation, make sure you have it in an append_after hook and not in config.after(:each) (several tutorials will mistakenly lead you down the wrong path here). It should look like this:
    config.append_after(:each) do
     DatabaseCleaner.clean
    end
  6. Prefer transaction instead of truncation for all non-Javascript tests (unit tests, controller tests, etc). For Javascript integration specs, you need truncation. An explanation about why can be found at https://github.com/DatabaseCleaner/database_cleaner#rspec-with-capybara-example
  7. Use Factories and don’t use fixture data. Fixture data can lead to brittle tests. Generally the entire Rails community has learned from the Dark Days and recommends factories over fixtures.
  8. Don’t use connection pooling. Some people on the internet will tell you to use connection pooling to solve thread-locking problems; don’t listen to them. Capybara already has dealt with this under the hood. Just make sure you are on a recent version of Capybara.
  9. Avoid using .trigger. Sometimes if an element isn’t visible Capybara will advise you when it fails you can ‘work around’ the element not being on the page by referencing the element and calling .trigger. You’re just trying to get around the on-screen-and-visible enforcement by Capybara, but this isn’t a good idea. If the thing isn’t on the screen and visible, it probably means there’s a bug and you want to catch that as a failure. Remember your tests are only as valuable as what they catch when things mess up.
  10. Circular Dependency when trying to load ___
    This development issue, which causes race condition (intermittent) failures, has been explained on this Thoughtbot blog post.

    To fix it if you’re on Rails 4.1 or prior, set this in your config/environments/test.rb file:

    config.allow_concurrency = false

    You do not need this if you are on Rails 4.2 and above.

Timing

Timing is super hard to debug, but there’s an art to it. Tame your Capy specs like a lion tamer. Make them jump through hoops and bedazzle them to calm them down. You need to understand 3 things: (1) Capybara’s native waiting, (2) a wait helper, and (3) an explicit sleep.

Capybara Native Waiting Behavior
If you’re on Capy 2.7, understand that Capybara natively waits for content on a page when you assert it to be there, even when Ajax and rendering might not have it ready at the very moment the assertion runs. Thomas Walpole, maintainer of Capybara, advises me:

The way 2.7.1 is handling this is through middleware that keeps a counter of any current requests being processed by the app. First it tells the browser to visit about:blank and waits for that to happen, at which point the browser should not be initiating any more requests to the app. Then it waits for the active request counter to be 0, and then continues on.

Instead of using sleep, use expect(page).to have_content(…) to wait for the content you want to appear. Specifically I believe that using expect/have_content waits for the page to have the content you want it to have, but expect/value/to eq does not actually wait. For this reason, sprinkle in some expect(page).to have_content even when you don’t have to just to get Capybara to pause until the page is re-rendered.

You often find yourself writing

expect(page).to have_content(“xxx”)

over and over again. Contrary to the instinct to not Repeat Yourself, that’s a good thing! If this really irks you, you may write yourself helpers to make this repeated step more encapsulated. What you are really doing is putting the UX through its paces, so think of it like player piano instructions, not like the code you work so hard on to make beautiful.

It will be easier to do this if your app has natural-language responses like “You have logged in successfully.” For this reason you should encourage your Product Owners/Stakeholders to put in such natural language indicators: it makes your site easier to use and safer, and your regression suite more solid. And your users will appreciate the instant feedback too. If your product managers insist on ‘silent’ feedback, remember you can use Capybara to assert that things are or aren’t disabled, grayed-out, etc.

Basically, although you can sometimes get away with expectations that do direct Ruby object lookup, you really shouldn’t, or should use it as infrequently as possible.

Wait Helpers (Ruby metaprograms Javascript)

Sometimes you’ll see a spec failure that will pass if you add sleep 1 or sleep 2. Avoid this, but use a very fast sleep (I use 0.1) when necessary. Instead of sleeping, turn off animations and write wait helpers for yourself to pause until certain conditions are met.

You should use wait helpers to wait for:

– Ajax requests that Capybara doesn’t seem to pick up natively (Later versions of Capy are supposed to count the number of outstanding Ajax requests but I’ve had difficulty getting this to work consistently. You can and should assert content is on page and prefer Capybara’s native waiting to a wait helper)

– Your app is doing something like initializing (you can even write your app to set itself a global flag when initialization has finished which can be checked from Capy helpers)

Here’s an example of a wait helper that waits for an Ajax request. Note here we are using page.evaluate_script to metaprogram Javascript by way of Ruby code, waiting until a condition is met before continuing the spec.

def wait_for_ajax
 counter = 0
 while page.evaluate_script("$.active > 0")
  counter += 1
  print "_"
  $stdout.flush
  sleep(0.1)
  if counter >= 100
   msg = "AJAX request took longer than 10 seconds."
   if page.driver.respond_to?(:console_messages)
    msg << " console messages at time of failure: " + page.driver.console_messages.inspect
   end
   raise msg
  end
 end
end

Here’s an example of a wait helper that would wait for your app’s own initialization cycle, provided yourApp is the variable in Javascript where your app is namespaced, and that when it is finished with its own initialization cycle it sets _initialized to true (on itself). You can write your own wait helpers appropriate to the things your app does.

def wait_for_your_app
 counter = 0
 while page.evaluate_script("typeof(yourApp) === 'undefined' || typeof(yourApp._initialized) === 'undefined'")
  counter += 1
  print "~"
  $stdout.flush
  sleep(0.1)
  raise "Your app failed to initialize after 10 seconds" if counter >= 100
 end
end
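
Both helpers share the same poll, sleep, and give-up loop, so one option is to factor that loop into a generic helper that takes the condition as a block. The sketch below is one way to do it; wait_until and its keyword arguments are names invented here, not part of Capybara. The Capybara call in the comment assumes page is a Capybara session and that jQuery is loaded on the page.

```ruby
# Generic polling helper (hypothetical name): yields until the block returns
# a truthy value, or raises once the timeout has passed.
def wait_until(timeout: 10, interval: 0.1, message: "Condition not met")
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  until yield
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) > deadline
      raise "#{message} after #{timeout} seconds"
    end
    sleep(interval)
  end
  true
end

# In a spec this might be used as (assumes a Capybara session and jQuery):
#   wait_until(message: "AJAX still active") { page.evaluate_script("$.active").zero? }

# Plain-Ruby demonstration: the condition becomes true on the third check.
attempts = 0
wait_until(timeout: 2, interval: 0.01) { (attempts += 1) >= 3 }
puts "condition met after #{attempts} checks"
```

Using a monotonic clock for the deadline keeps the timeout honest even when the block itself is slow, which a fixed iteration counter does not.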

Explicit Sleeps

When all else fails sometimes you just need a sleep, which you just do in ruby as sleep X where [X] is the number of seconds you want to sleep.

You should use sleeps very rarely, but I’ve found they are needed in these cases:
– After an Ajax request, sometimes a sleep is needed to let the database catch up. (I try to keep these at about 0.5 seconds)
– A small timing delay (no longer than 0.1 seconds) for your app doing something like re-rendering

In theory you can wait for anything, so try to use Capybara’s internal waiting mechanisms first. In this order, your toolkit is:

1. Capybara’s internal waiting
2. A wait helper (as explained above)
3. An actual explicit sleep (try to keep all sleeps under 0.2 secs)

Remember each knife is sharper than the next, and so you should strive for minimal intrusiveness, but know that a combination of all three is likely necessary. The more astray you go from the art, the more likely you are to experience timing delays.

Warning for anyone who has an expires_in set as a cache-control header in your controller endpoints (html or json).

Yes you! Go look in your code right now for expires_in set in your controllers and if you have any pay attention.

As I documented here, you’ve got to watch out if you have endpoints that have non-Zero cache-control headers on them. The headless driver (poltergeist or webkit) will hang onto the HTTP response between specs. This can be detrimental to you, if, say, the content of that endpoint’s response is what you are testing. In my case, I just used an inelegant hack to work around this- suggestions for improvements welcome.

if Rails.env.test?
 expires_in 0.minutes, :public => false
else
 expires_in 3.minutes, :public => true
end

Conclusion

Try to keep your feature specs to about 1-3 minutes to run per file, and maybe split them off when they are about 200-300 lines long. Be mindful of the total run time: since they are so valuable you can afford a little leeway here, but keep in mind it slows down your time to develop new features.

Be careful about assertions that reach back into the database. Although you can get it to work, reloading objects and asserting things have changed is prone to race conditions, particularly with database cleaner. Remember that you have two threads operating separately, and even if you are able to do .reload on the object to get it into the right state, it’s actually nearly always better when writing Capybara specs to just assert the UX has changed the way you think it will.

And finally: Patience, discipline, know that others have been here before you and others will come here again. You are on the pinnacle of Rails development – don’t fall! Patience and faith.


May 25, 2016
 
def wait_for_ajax
 counter = 0
 while page.evaluate_script("typeof($) === 'undefined'")
  counter += 1
  print "^"
  $stdout.flush
  sleep(0.1)
  raise "Jquery not initialized after 10 seconds." if counter >= 100
 end

 counter = 0
 while page.evaluate_script("$.active > 0")
  counter += 1
  print "_"
  $stdout.flush
  sleep(0.1)
  if counter >= 100
   msg = "AJAX request took longer than 10 seconds."
   if page.driver.respond_to?(:console_messages)
    msg << " console messages at time of failure: " + page.driver.console_messages.inspect
   end
   raise msg
  end
 end
end

Feb 08, 2015
 

First you need to know how to connect to your Heroku application using bash. You can then use the du -h command to get a read on how big the files inside your slug are.

heroku run bash -a yourappname

Once you’ve done that, here’s the magic command that will show you how big each folder in your slug is:

du -h --max-depth=1

Max-depth of 1 will give you 1 level deep output of how big each of those root folders are. You can change the max-depth to see more granular introspection of each folder. (However, realistically, it is probably more useful to cd down to some subfolders and run du -h --max-depth=1 again.)
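
If you would rather stay in Ruby (for example from a console), the same one-level summary can be approximated with the standard library. This is a sketch, and directory_sizes is a name invented here. It adds up apparent file sizes in bytes rather than disk blocks, so the numbers will differ somewhat from what du reports.

```ruby
require "find"

# Hypothetical helper: total size in bytes of the files under each
# immediate subdirectory of root (symlinks are skipped).
def directory_sizes(root)
  Dir.children(root).sort.each_with_object({}) do |entry, sizes|
    path = File.join(root, entry)
    next if File.symlink?(path) || !File.directory?(path)

    total = 0
    Find.find(path) do |file|
      total += File.size(file) if File.file?(file) && !File.symlink?(file)
    end
    sizes[entry] = total
  end
end
```

Calling directory_sizes(".") from the app root returns a hash of folder name to bytes, which you can sort by value to find the biggest offenders.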

Some observations I have made:

1. Obviously you basically want to get as much out of your assets/ folder as you possibly can. Almost everything I do these days is tied to some kind of managed content, so I can safely put almost all the images and videos in my app into a Paperclip-backed data model. Using S3, Paperclip stores all the different sized versions it will render remotely, so all that stuff stays out of the slug.

2. Beware of Gem bloat. If you don’t use it, you probably want to get it out of there.

3. Generally speaking just keep large files out of your app. If you’re needing a lot of larger data files re-think your architecture to work in a more distributed, background job kind of way that doesn’t require the files to be in the slug. (Remember, worker dynos can always download from or upload to S3 and storage on S3 is much cheaper than keeping things in your slug.)

4. Use .slugignore (well documented so I won’t go into it here.)

5. I found that using gems directly from Rubygems seems to save a lot of space compared to using gems which point to a Github repo.

You can recognize a gem pulled from its Rubygems version because it usually has few or no parameters, or only a version parameter:

(pulling from Rubygems)

gem 'spree', '2.1.12'

It turns out that because Spree can be a little lazy about building the Gems, they recommend you just point your Gemfile to the github repo. (To be fair, they maintain a lot of bugfixes across several branches, and pointing to a branch on Github makes much more sense if you have software that is retroactively fixed like Spree)

(Pointing to a github repo)

gem 'spree', git: 'https://github.com/spree/spree.git', branch: '2-1-stable'

The difference here is that Bundler can just pull a prebuilt gem package in the first example, whereas it has to download the repo and turn it into a Gem in the second example. In my app, the former (pulling from Rubygems) saves me a whopping 70 MB compressed in my slug. Although I haven’t confirmed it, I suspect this is because Bundler is actually very efficient at packaging up a Gem and excludes all the support files you don’t want when you create the gem, whereas the ‘lazy’ way of just pointing to a Git repo has the side-effect of including all the files in the repo in your finished slug.