Post by batmanbury on Mar 13, 2014 9:29:36 GMT
I noticed the activity on this board declined pretty quickly, so I hope everyone is at least using the discussion forums on the edX.org website.
I wanted to show something I found interesting about my take(s) on Problem Set 1, Excercise 3: Alphabetical Substrings. I took this course last semester, so I mainly wanted to take it again to see if and how my skills improved, and how my thoughts on problem solving changed. Since the due date for this PSet is past, I hope nobody minds my posting the code in its entirety.
For this problem, we're supposed to write a program that prints the longest substring in a string in which the letters occur in alphabetical order. For example, if s = 'azcbobobegghakl', then the program should print:
In the case of ties, print the first substring. For example, if s = 'abcbcd', then your program should print:
Here are my two solutions, you can compare them yourself, then I'll talk about what I did below.
Oct. 2013 Solution:
Mar. 2014 Solution:
You can see the 2013 solution is clearly 'shorter' but that really has nothing to do with how efficient it is, or rather, isn't. The main loop is a nested for loop which begins by iterating over each letter of the initial string, s, then does so again (hence the nested loop) in order to take 'slices' of increasing size starting at each letter. Basically, it takes every possible substring, of every possible length and stores them in the 'strings' list. This is terribly inefficient and unnecessary, but it worked at the time, and it was all I could think to do. The second loop goes through each of the stored substrings, then checks them against a sorted version of the same substring, which means if the substring is 'qwerty,' it will sort it (alphabetize it) and check if they are both the same. 'Qwerty' is not the same as 'eqrtwy' so that string would NOT get added to the 'alphas' list. If they are the same, for example 'eggs' == 'eggs' so that would go into the 'alphas' list. By the time this finishes, I have two lists and haven't accounted for ties. On the last line, I print the longest substring with max(alphas, key=len), which just luckily happens to result in the correct substring even if there's a tie (by virtue of how Python iterates through lists). Like I said, not great, but it worked at the time.
For the 2014 solution, I started without referencing my old code. I created a function which initializes two empty strings, 'longest' and 'current.' Then I enter the main loop iterating over the length of the input string. The inner loop goes over each character in the string, checking to see if adding one additional character will result in an alphabetical string. If it does, then 'current' will continue to grow. For example, if the main string is 'alphabet' then 'current' will go from 'a' to 'al' to 'alp' but once it hits the 'h' it doesn't get added. Instead, if the length of 'longest' is shorter than 'current' (which it is) it takes that value. At this point, 'longest' = 'alp' then current gets reset to an empty string, and the inner loop breaks to go to the next letter. To continue the example, we'll now check substrings that begin with 'l' then 'p' then 'h' which offer nothing alphabetically longer than 'alp.' But, at the end, 'current' will build up to 'abet' which then replaces 'longest' and the we exit the loop before returning 'abet' as the longest substring.
So in the new code, I managed to solve it without storing anything in memory larger than a string, whereas I was creating these potentially longer and longer lists with my old code. I was also storing non-alphabetical substrings that had no reason to be stored. As far as ties, the new code will naturally choose the first appearing substring because of that greater-than sign (>). Only longer substrings will replace 'longest.'
So, maybe there's not many people here to see it, but I hope someone thinks it's as interesting as I do. I was surprised to see how much my code changed after just a few months. If you want to grow and develop your skills, just keep taking classes, program as often as you can. I think the important thing for me here is that BOTH times around this problem I thought I did the best I could do at the time. Clearly the second solution is better, but there's obviously an even better solution out there than mine. There are endless ways to create any given program, and only the time you put into programming will help you imagine the cleaner or more efficient or elegant implementation.
I wanted to show something I found interesting about my take(s) on Problem Set 1, Excercise 3: Alphabetical Substrings. I took this course last semester, so I mainly wanted to take it again to see if and how my skills improved, and how my thoughts on problem solving changed. Since the due date for this PSet is past, I hope nobody minds my posting the code in its entirety.
For this problem, we're supposed to write a program that prints the longest substring in a string in which the letters occur in alphabetical order. For example, if s = 'azcbobobegghakl', then the program should print:
Longest substring in alphabetical order is: beggh
In the case of ties, print the first substring. For example, if s = 'abcbcd', then your program should print:
Longest substring in alphabetical order is: abc
Here are my two solutions, you can compare them yourself, then I'll talk about what I did below.
Oct. 2013 Solution:
strings = []
alphas = []
for j in range(len(s)+1):
for i in range(len(s)):
if s[i:j] != '':
strings.append(s[i:j])
for substring in strings:
if substring == ''.join(sorted(substring)):
alphas.append(substring)
print max(alphas, key=len)
Mar. 2014 Solution:
def subcount(s):
longest = ''
current = ''
for i in range(len(s)):
for c in s[i:]:
if current + c == ''.join(sorted(current + c)):
current += c
else:
if len(current) > len(longest):
longest = current
current = ''
break
else:
current = ''
break
return longest
print 'Longest substring in alphabetical order is: ' + subcount(s)
You can see the 2013 solution is clearly 'shorter' but that really has nothing to do with how efficient it is, or rather, isn't. The main loop is a nested for loop which begins by iterating over each letter of the initial string, s, then does so again (hence the nested loop) in order to take 'slices' of increasing size starting at each letter. Basically, it takes every possible substring, of every possible length and stores them in the 'strings' list. This is terribly inefficient and unnecessary, but it worked at the time, and it was all I could think to do. The second loop goes through each of the stored substrings, then checks them against a sorted version of the same substring, which means if the substring is 'qwerty,' it will sort it (alphabetize it) and check if they are both the same. 'Qwerty' is not the same as 'eqrtwy' so that string would NOT get added to the 'alphas' list. If they are the same, for example 'eggs' == 'eggs' so that would go into the 'alphas' list. By the time this finishes, I have two lists and haven't accounted for ties. On the last line, I print the longest substring with max(alphas, key=len), which just luckily happens to result in the correct substring even if there's a tie (by virtue of how Python iterates through lists). Like I said, not great, but it worked at the time.
For the 2014 solution, I started without referencing my old code. I created a function which initializes two empty strings, 'longest' and 'current.' Then I enter the main loop iterating over the length of the input string. The inner loop goes over each character in the string, checking to see if adding one additional character will result in an alphabetical string. If it does, then 'current' will continue to grow. For example, if the main string is 'alphabet' then 'current' will go from 'a' to 'al' to 'alp' but once it hits the 'h' it doesn't get added. Instead, if the length of 'longest' is shorter than 'current' (which it is) it takes that value. At this point, 'longest' = 'alp' then current gets reset to an empty string, and the inner loop breaks to go to the next letter. To continue the example, we'll now check substrings that begin with 'l' then 'p' then 'h' which offer nothing alphabetically longer than 'alp.' But, at the end, 'current' will build up to 'abet' which then replaces 'longest' and the we exit the loop before returning 'abet' as the longest substring.
So in the new code, I managed to solve it without storing anything in memory larger than a string, whereas I was creating these potentially longer and longer lists with my old code. I was also storing non-alphabetical substrings that had no reason to be stored. As far as ties, the new code will naturally choose the first appearing substring because of that greater-than sign (>). Only longer substrings will replace 'longest.'
So, maybe there's not many people here to see it, but I hope someone thinks it's as interesting as I do. I was surprised to see how much my code changed after just a few months. If you want to grow and develop your skills, just keep taking classes, program as often as you can. I think the important thing for me here is that BOTH times around this problem I thought I did the best I could do at the time. Clearly the second solution is better, but there's obviously an even better solution out there than mine. There are endless ways to create any given program, and only the time you put into programming will help you imagine the cleaner or more efficient or elegant implementation.