In [1]:
#https://github.com/jbn/IPlantUML
import iplantuml

Revisit ICP

2019-07-04_12h33_54.png

  • Order was important
  • Starts with positive example
  • Each example can vary only in one important feature
  • How far to generalize? Its usualy to over generalize or under generalize in ICP
  • What if examples come in arbitrary order, and lack of bg knowledge.

Version spaces help here.

Abstract Version Spaces

This is how we converge as a process.

Version_Spaces_1.gif

Visualizing

ICP We take one positive model, and that moves towards speciality or generality extremes like below depending on order of example.

ICP_1.gif

VS We take 2 models. A special model and general.

For every positive example, we generalize the special model. For every negative example, we specialize the general model.

VS_1.gif

Note, Joyner called them as concept and also model interchangeably.

Generalizing & Specializing

Suppose we have Frame 1 as shown below. The only way to generalize it further is to make one of the specific attributes to any (indicated by ?). For example, any food instead of expensive food, is one generalization possibility.

2019-07-05_08h36_08.png

Similarly, the only way to specialize is to make one of general attribute, specific. For example, not any time, but only during lunch.

2019-07-05_08h37_42.png

Extreme Cases

Specializing an extreme general model

So for an extreme generalized model, say G1, what could be possible specializations? One of the attributes could be a specific value. So if there are 4 slots, there could be 4 specializations (one step at a time specialization).

Figure 1 Specializing an extreme general modelG1?:???GS1X:???GS2?:X??GS3?:?X?GS4?:??X

Generalizing an extreme special model

Here again, generalizing one slot at a time, we could generalize an extreme 4 slot specialized model, in 4 ways.

Figure 2 Generalizing an extreme Specialized ModelS1W:XYZSG1?:XYZSG2W:?YZSG3W:X?ZSG4W:XY?

Example

2019-07-04_18h11_59.png

Looking at it in human eyes, one coudl guess, the allgeric reaction is only when subject visits Sam's.

Step 1 - Positive Example

Establish the specific and generalized models initially from given positive example.

2019-07-04_18h13_43.png

Step 2 - Negative Example

For negative example, specialize the general model.

2019-07-05_10h50_32.png

Before we get in to that, let us generalize the extreme specialized model once. We did similar one in Figure 2 above. There are many possibile comibinations. To be exact, 3+3+7+2 = 15 combinations possible. We have shown only 4 combinations below. (AI by Winston, Chapter 20)

Figure 3 Generalizing S1S1Sam:breakfastFridaycheapSG1?:breakfastFridaycheapSG2Sam:?FridaycheapSG3Sam:breakfast?cheapSG4Sam:breakfastFriday?

For the extreme general model G1, specializing would mean, specific value for each of its slots, so 4 specializations possible like in Figure 1, but what to fill up with? We want one single converged model, so the specialization of G1 should march towards the extreme specialized model S1. Thus, our first step could be specializing G1 with values from S1. This is without even looking at negative example.

Figure 4 Specializing G1G1?:???GS1Sam:???GS2?:breakfast??GS3?:?Friday?GS4?:??cheap

Compare Figure 3 and 4. If we had continued generalizing S1 in Figure 3, somewhere, the models would be equal to those we just derived in Figure 4. Thus,

Each new specialization of G1, is a generalization of most specific model S1

This ensures convergence and we would follow this in every step. So, in general

Each new specialization is a generalization of most specific model

Pruning:

Now for given negative example, we do not want our model to select that as allergic combination because it is not so; the given negative example does not produce allergic reaction.

Figure 5 PruningEx2Kim:lunchFridayexpenseGS3?:?Friday?

Because of the generality, GS3 can still allow Ex2 as a valid allergic combo. We do not want our model to allow that. So we prune GS3. No other GSX frame have this problem with Ex2, so the specialized models of G1 would be the rest 3 at this step.

Figure 6 Final Special Models of G1 After PruningG1?:???GS1Sam:???GS2?:breakfast??GS4?:??cheap

This is what is illustrated below

VSEx_01.gif

Step 3 - Positive Example

2019-07-05_11h31_40.png

Step 1: Generalize the latest specific models on left side.

We have only S1. Trying to generalize it (specific slots take general values), we get Figure 3 as before. As earlier there are 15 combinations possible, but we could cut short by directly taking ones that is compatable with positive example. We want our model to select above example. Let us compare both exmaple and our S1.

Figure 7 : S1 and given positive exampleS1SambreakfastFridaycheapEx3SamlunchSaturdaycheap

We want our new eneralized model SG1 to be able to select this positive example. So it should generalize as lunch/breakfast and Friday\Saturday. In other words, simply by ? there.

Figure 8 Specializing S1S1SambreakfastFridaycheapSG1Sam??cheap

Step 2: Validate the latest specialized models on RHS

Now due to this new example, we should re valided GSX models with positive example. Because none of GSX should exclude this positive example. Do we have anything like that?

Figure 9 Latest Special Models GSX vs Positive ExampleG1?:???GS1Sam:???GS2?:breakfast??GS4?:??cheapEx3SamlunchSaturdaycheap

Note GS2 excludes Ex3, whereas it should not. We want our model to select Ex3. So we prune out GS2 here.

Figure 10 Final Special Models of G1 After Pruning to allow Ex3G1?:???GS1Sam:???GS4?:??cheapEx3SamlunchSaturdaycheap

Note now both GS1 and GS4 allows Ex3. With this we are done at this step. This is illustrated below.

VSEx_03.gif

Step 4 - Negative Example

2019-07-05_12h00_58.png

Step 1 We want to exclude this example in our RHS GSX models. Lwet us compare them.

Figure 10 Comparing latest GSX vs Ex4G1?:???GS1Sam:???GS4?:??cheapEx4BobBreakfastSundaycheap

GS1 already excludes Ex4. So its good there. GS4 includes Ex4. So we need to specialize it to exclude Ex4. We could do that as below.

Figure 11 Specializing GS4 to exclude Ex4G1????GS1Sam???GS4???cheapGS41Sam??cheapGS42?breakfast?cheapGS43??FridaycheapEx4BobBreakfastSundaycheap

However there is a problem here. GS42 and GS43 contradicts with SG1, by being specific about breakfast and Friday. This cannot be true (Note if so, GS42 would exclude Ex3, a positive example where the day was Saturday. We cannot allow that). So we prune away GS42 and GS43.

Figure 11 Valid Specialized GSX after complying with SG1G1????GS1Sam???GS4???cheapGS41Sam??cheapSG1Sam??cheap

Compare GS1 and GS41. GS1 subsumes GS41. So by rule, GS41 can be pruned away (Why not we do in other direction, that is remove more generalized GS1 instead of GS41 since we specialize from RHS?I do not know yet).

Figure 12 Final GSXS1SambreakfastFridaycheapSG1Sam??cheapG1????GS1Sam???

This is illustrated below.

VSEx_04.gif

Example 5 - Negative Example

2019-07-05_12h33_11.png

Again a negative example. As usual we got to most specialized general model, now the GS1, and specialize it further so as to avoid Ex5 negative example. However, at same time, should also ensure, the specialized model is inclusive of or not conflicting with most generalized special model SG1

Figure 13 : Current Situation With New Negative ExampleS1SambreakfastFridaycheapSG1Sam??cheapG1????GS1Sam???Ex5SamBreakfastSundayexpensive

I have multiple possibilties of specialization with 3 generalized slots in GS1, but I need to choose one that does not conflict with SG1. So the only specialization that cannot be in conflictw ould be one with cheap slot value, which also excludes Ex5 which is our priori requirement.

Figure 14 : Final ModelS1SambreakfastFridaycheapSG1Sam??cheapG1????GS1Sam???GS11Sam??cheap

GS11 and SG1 are now same, and thus converged. This is our final model.

VSEx_05.gif

Algorithm

2019-07-05_12h54_44.png

VS could have each example varying in many features as we saw above, not in ICP

Quiz

Step 1 - Positive Example

2019-07-05_12h57_54.png

Quiz Step 1Ex1KimBreakfastFridaycheapnoS1kimbreakfastfridaycheapnoG1?????

Step 2 - Positive Example

We should generalize the most specialized S1.

2019-07-05_13h03_50.png

Quiz Step 2 - before generalizationEx2KimlunchFridaycheapnoS1kimbreakfastfridaycheapnoG1?????

We need our model to generalize to include this 2nd positive example. So the only thing varying here to include is generalizing time - its lunch or breakfast.

Quiz Step 2 - after generalizationEx2KimlunchFridaycheapnoS1kimbreakfastfridaycheapnoSG1kim?fridaycheapnoG1?????

Step 3 - Negative Example

We should specialize the most generalized G1

2019-07-05_13h09_33.png

Quiz Step 3 - before generalizationEx3SamlunchSaturdaycheapnoS1kimbreakfastfridaycheapnoSG1kim?fridaycheapnoG1?????

Our GSX models should not include the given example. There are multiple possibiltiies filling up for G1, but we want only those that are not in conflict with SG1. Initially we see 4 possibilities which are those in compliant with SG1

Quiz Step 3 - before generalizationEx3SamlunchSaturdaycheapnoS1kimbreakfastfridaycheapnoSG1kim?fridaycheapnoG1?????GS1kim????GS2??friday??GS3???cheap?GS4????no

However, GS3, and GS4 would allow the negative example, so they should be pruned away.

Quiz Step 3 - after pruningEx3SamlunchSaturdaycheapnoS1kimbreakfastfridaycheapnoSG1kim?fridaycheapnoG1?????GS1kim????GS2??friday??

Step 4 - Negative Example

2019-07-05_13h20_29.png

It is a negative example, so we should specialize the general models GS1 and GS2 such that, this example is not allowed.

Quiz Step 4 - current situation with negative exampleEx4KimbreakfastsundaycheapyesS1kimbreakfastfridaycheapnoSG1kim?fridaycheapnoG1?????GS1kim????GS2??friday??

Due to numerous possibilities, considering SG1 compliance in mind at same time, new GSXX models excluding Ex4, we get about 3 possibitlies finally.

Quiz Step 4 - GSXX possibilitiesEx4KimbreakfastsundaycheapyesS1kimbreakfastfridaycheapnoSG1kim?fridaycheapnoG1?????GS1kim????GS2??friday??GS11kim?friday??GS12kim???noGS21??friday?no

There seems to be no subsuming as well., so we proceed.

Step 5 - Negative Example

2019-07-05_13h31_09.png

Quiz Step 5 - current situation with negative exampleEx5SambreakfastsundayexpensiveyesS1kimbreakfastfridaycheapnoSG1kim?fridaycheapnoG1?????GS1kim????GS2??friday??GS11kim?friday??GS12kim???noGS21??friday?no

Again we should specialize all general models, such that, they exclude Ex5, but compliant with SG1. But none of above GSXX allow Ex5 anyway. So I decide to do nothing for this example.

Step 6 - Positive Example

2019-07-05_14h06_52.png

We have to now generalize the most specfic model SG1. At same time, the GSXX should allow this model.

Quiz Step 6 - current situation with positive exampleEx6KimlunchsaturdaycheapnoS1kimbreakfastfridaycheapnoSG1kim?fridaycheapnoG1?????GS1kim????GS2??friday??GS11kim?friday??GS12kim???noGS21??friday?no

Generalizing SG1

  • Between SG1 and Ex5, the only generalization we see is it could be friday/saturday.

Checking if GSXX allows this model

  • Comparing Ex5 and GS11, GS11 would not allow this example, so we prune it (generalizing it would mean going backwards to GS1)
  • Comparing Ex5 and GS12, GS12 will allow this example, so shall remain
  • Comparing Ex5 and GS21, GS21 will not allow this example so we prune it

Combining the observations, we get,

Quiz Step 6 - After generalizing and pruningEx6KimlunchsaturdaycheapnoS1kimbreakfastfridaycheapnoSG1kim?fridaycheapnoSG11kim??cheapnoG1?????GS1kim????GS12kim???no

Step 7 - Negative example.

2019-07-05_14h16_22.png

This is a negative example, so we should specialize GS12, and prune as necessary. Also make GS12X models complaint with SG11.

Quiz Step 7 - Current situation with new negative exampleEx7KimlunchmondayexpensivenoS1kimbreakfastfridaycheapnoSG1kim?fridaycheapnoSG11kim??cheapnoG1?????GS1kim????GS12kim???no

Comparing GS12 with Ex7, in current form it allows. So we could have one of 3 general slots of GS12 to have a value specialized, so as to exclude Ex7. Looking at SG11, that specialized value could only be cheap, because the SG11 suggests it could be breakfast/lunch and any day. We should not create a conflict for that.

Quiz Step 7 - After specialization and pruningS1kimbreakfastfridaycheapnoSG1kim?fridaycheapnoSG11kim??cheapnoG1?????GS1kim????GS12kim???noGS121kim??cheapno

Now GS121 and SG11 are the same! That should be our converged final model.

That's right! :)

Identification Trees

  • Pick one feature and create tree based on end result binary. Here, restaurant for example.
  • For new example, one has to find the path that is closest to end result.
  • Also Decision tree learning - easy but trade off is that to know all 5 examples initially

2019-07-05_14h38_27.png

Optimal Decision Trees

  • One could have choosen any feature resulting in different tree possibilities. Clearly one could be most optimal as illustrated below.

2019-07-05_14h43_32.png

2019-07-05_14h43_43.png