Recursion Exists For A Reason

I’ve survived another Clojure recurs or not recurs crisis. I understand why Clojure has sequences, but recursion is built into the language, and why its use represents a non-functional programming style is news to me. I have albeit helpful comments to that effect.

A lot of list like things can be abstracted into sequences,  without having to know what kind of sequence, so I am beginning to get why someone would want to use map and other sequence-oriented functions.

However, sometimes, you may want to take the difference between two similar spreadsheet reports. A value in one column  (a key) of Report A should be found in Report B in another column.

I spent a lot of extra learning time trying to take a list of keys present in Report A and see if they were also present in Report B using only Clojure’s sequence functions. Personally, I wanted to write a function to perform a match, return nil, if the match was found, or the search row if the match was not found. How hard is that?

I was frustrated, because it felt like all my logic was being re-routed, rather than part of it, like learning how to use map, filter, and so on.

The data looks like this:


[["0123456789" "SMITHFIELD" "HAM"]["1123456789" "LITTLE" "CHICKEN"] ...]

Here is the solution, which uses loop .. recur.


(defn ret-non-match-rows
"Expects a sequence of sequences, like what is returned from clojure-csv.
Returns nil if there's a match; else returns failing row."

[s-o-s cmp-col-idx inq-row-idx inq-row]

(let [inq-row inq-row]
  (loop [[row & remain-seq] s-o-s pos 0]
    (let [cmp-val (nth inq-row inq-row-idx nil)]
    (cond
    (not row) inq-row
    (= cmp-val (nth row cmp-col-idx)) nil
    :not-found (recur remain-seq (inc pos)))))))

(defn test-key-exclusion
"This function takes csv-data1 (the includees) and tests to see
if each includee is in csv-data2 (the includeds). This function
uses full rows, so ssn, lnam, and fnam can be reported."

[csv-data1 pkey-idx1 csv-data2 pkey-idx2 lnam-idx fnam-idx]

; This is a case for using the thrush ->> operator in Clojure.
; Insert each result into the last position of the next call.

  (->> (map #(ret-non-match-rows csv-data2 pkey-idx2 pkey-idx1 %1) csv-data1)
       (filter (complement nil?))
       (map (fn [row]
         (vector (nth row pkey-idx1 nil)
         (nth row lnam-idx nil)
         (nth row fnam-idx nil))))))

Used with a map statement, a sequence is returned that contains a lot of nils (hopefully) and a few non-nill entries. It is those entries that go into a missing report.

So, the moral of the story is, sometimes you want to traverse a sequence and stop when a condition, like a key match, has occurred. But do not give up on sequences.

Advertisements

13 Comments

Filed under Clojure

13 responses to “Recursion Exists For A Reason

  1. ohpauleez

    (nth …) is sort of a bad-smell in Clojure code. You might want to restructure your loop/recur to use first/rest and destructuring to make it a little more readable and maintainable.

    As previously said, a sequence based solution here would use filter (potentially remove also).

    • Octopusgrabbus

      As I’ve replied above, bad smell or not, I believe pulling out the nth column for comparison, or getting at the column another way, is the only way to solve this problem.

      As I noted in a second reply to the above comment, the two reports are not equal. That is you cannot compare them on a row by row basis. However, one column in each may have values that match.

      Hopefully, each report’s column values will always match. The goal is every key in report A column n should be found in report B column y and visa versa, where n and y may or may not be the same value.

  2. Just to provide an actual solution that relies entirely on the sequence functions:


    (defn ret-non-matching-rows [seqs]
    (filter (partial apply (complement =)) (apply map vector seqs)))

    At the REPL:


    user> (ret-non-matching-rows [[1 2 3 4 5] [1 2 6 4 5] [1 2 3 4 7] [1 2 3 4 5]])
    ([3 6 3 3] [5 5 7 5])

    If someone told you that recursion isn’t functional, they’re just plain wrong. Recursion is the preferred way to do traditional iteration in a functional style. What the person may have been trying to say is that recursion is not the preferred idiomatic solution for many problems that have been solved via recursion traditionally, mainly because of the existence of the fantastic sequence abstraction in Clojure.

    As you can see here, the solution provided by the sequence abstraction is much cleaner and, IMHO, easier to read, if only for the amount of code you have to read. It makes some assumptions like all of the sequences being of the same length and all of the entries being comparable, but they can be mitigated through slightly more coding. I did it this way because your solution doesn’t seem to compensate for those problems either.

    • Octopusgrabbus

      Thank you. I appreciate your answer, and believe I am following its sentiments. That is, try to use sequence functions in Clojure everywhere that it makes sense. I’ve learned a few computer programming languages in thirty years. I’ve learned one thing, there are very few absolutes in anything we do.

    • Octopusgrabbus

      My solution is that if for every value in one column of Report A matches once in a second column in Report B, then leave the row (containing column containing the value that was matched) out of the list of “missing” values.

      Your solution looks like the whole row will fail if it doesn’t match. For every row in report A, there will probably never be a match in Report B. All rows in report A will never be found in report B.

      • OK, let’s try again then. 🙂


        (defn ret-non-matching-rows [report-a-seqs report-b-seqs]
        (filter (comp (complement #(some (partial = %) (map first report-b-seqs))) #(nth % 2)) report-a-seqs))

        REPL:


        user> (ret-non-matching-rows [[1 2 3 4 5] [6 7 8 9 10]] [[1] [3]])
        ([6 7 8 9 10])

        Assumptions:

        1. We’re only ever dealing with 2 reports, A and B.
        2. Every element in the columns are comparable
        3. A report is a seq of seqs.
        4. We’re only checking that a particular value in a column of a row of A is present in any row of another column in B. If it’s there, throw the row away, if it’s not, keep the row because we have a ‘non-matching’ row.

        `some` is perfect for checking if a given predicate passes for some value in a collection. I use an `=` check here to verify that it exists in that column.

        I check for it’s existence in different columns by using nth to get at a particular column of report a and first to get at the first column of report b. You could just as easily use any of the other positional functions or even destructuring to get at those values. This is just a quickly hacked together version.

        The entire implementation is lazy so if you only ever want the first one you can call this function like `(first (ret-non-matching-rows report-a report-b))` and it will only do the work required to get you to a single non-matching row.

        How’s that look?

      • Octopusgrabbus

        What you said in 1., 2., and 4. sounds right. It sounds like in 2. that the whole row is being compared.

        In Report A, column 1 (0 indexed) I am interested in every Report A row, whose column 1 key value is not present in Report B column 0.

      • Octopusgrabbus

        Tim:

        This was a good solution. I’ve modified to allow for both indexes, given the function is called with Report A and B interchanged. That is I want all keys in Report B listed that don’t appear in Report A, as well as the reverse.

        (defn ret-non-matching-rowsX
        [report-a-seqs col-a-idx report-b-seqs col-b-idx]
        (filter (comp
        (complement #(some (partial = %) (map (fn [row] (nth row col-b-idx nil)) report-b-seqs)))
        #(nth % col-a-idx nil))
        report-a-seqs))

  3. ivant

    Recursion is functional, even the loop…recur variant. It’s considered lower-level machinery and can be harder to read.

  4. Cedric

    Recursion isn’t non-functional, but loop/recur specifically might be regarded as somewhat non-functional as it’s fairly clearly an iteration.

    There is, by the way, a way to stop on the first match without giving up your higher-order sequence functions, and that is (first (filter …)) (you can also use (first (remove …)) if that fits better). IF the input is lazy, this will stop realizing elements and consuming the input at the first match (or non-match, in the case of remove). It’s also not incompatible with seqs backed by I/O resources; just use (with-open [x (open-the-resource)] (first (filter (… (make-seq x) …)))) to release the resource after the result of “first” has been determined.

    • Octopusgrabbus

      Thanks for the reply. I’ll keep filter in mind. Even my loop .. recur function returned either the missing value (in its row) or nil, and I used filter to rid the nils, because I wasn’t going to report on nil, as nil entries represent success.

  5. Joe

    Who says recursion is non-functional? The Little Schemer book uses recursion all through the book. Haskell, which is considered the ‘most’ functional language has recursion as a part of the language.

    • Octopusgrabbus

      Recently, I posted a Clojure question on a well-known problem site, and got a comment about using a recursive solution. The way I wanted to solve my problem was best solved with a recursive solution to search for a match. That is because my problem required knowledge of indices in each row of spreadsheet data, a vector of vectors.

      I took the comment in the most positive light, which got me to go back and look at using the sequence functions more than I had been. I found the re-look to be beneficial.

      I’ve also had comments that loop .. recur is for really “low level” stuff. So, I’m taking a middle approach. Learn to deal with sequences better than I have, and do not rule out recursion.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s