# AIOU Solved Assignments 2 Code 8614 Autumn & Spring 2021

AIOU Solved Assignments 1 & 2 Code 8614 Autumn & Spring 2021. Solved Assignments code 8614 Educational Statistics 2021. Allama Iqbal Open University old papers.

Assignment No. 2

Autumn & Spring 2021

Educational Statistics (8614)

Q.1 What is data cleaning? Write down its importance and benefits. How to ensure it before analysis of data?

Data Cleaning

‘Cleaning’ refers to the process of removing invalid data points from a dataset.

Many statistical analyses try to find a pattern in a data series, based on a hypothesis or assumption about the nature of the data. ‘Cleaning’ is the process of removing those data points which are either (a) obviously disconnected from the effect or assumption which we are trying to isolate, due to some other factor which applies only to those particular data points, or (b) obviously erroneous, i.e. some external error is reflected in that particular data point, due to a mistake during data collection, reporting, etc.

In the process we ignore these particular data points, and conduct our analysis on the remaining data.

‘Cleaning’ frequently involves human judgement to decide which points are valid and which are not, and there is a chance of discarding valid data points that were caused by some effect not sufficiently accounted for in the hypothesis/assumption behind the analytical method applied.

The points to be cleaned are generally extreme outliers. ‘Outliers’ are those points which stand out for not following a pattern which is generally visible in the data. One way of detecting outliers is to plot the data points (if possible) and visually inspect the resultant plot for points which lie far outside the general distribution. Another way is to run the analysis on the entire dataset, eliminate those points which do not meet mathematical ‘control limits’ for variability from a trend, and then repeat the analysis on the remaining data.
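The control-limit idea can be illustrated with a small sketch. It assumes, purely for illustration, that the "control limits" are the mean plus or minus two sample standard deviations; the function name, the cut-off `k`, and the scores are hypothetical and not part of the original answer.

```python
import statistics

def split_by_control_limits(values, k=2.0):
    """Return (kept, outliers) using mean +/- k sample standard deviations as limits."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    lower, upper = mean - k * sd, mean + k * sd
    kept = [v for v in values if lower <= v <= upper]
    outliers = [v for v in values if v < lower or v > upper]
    return kept, outliers

# Hypothetical scores: one entry (95) lies far from the rest.
scores = [10, 12, 11, 13, 12, 11, 10, 12, 95]
kept, outliers = split_by_control_limits(scores)

print("outliers:", outliers)
print("mean before cleaning:", round(statistics.mean(scores), 2))
print("mean after cleaning:", round(statistics.mean(kept), 2))
```

In practice the flagged points should still be inspected before removal, since a single extreme value inflates the standard deviation and the choice of `k` is a judgement call.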

Cleaning may also be done judgementally, for example in a sales forecast by ignoring historical data from an area/unit which has a tendency to misreport sales figures. To take another example, in a double-blind medical test a doctor may disregard the results of a volunteer whom the doctor happens to know in a non-professional context.

‘Cleaning’ may also sometimes be used to refer to various other judgemental/mathematical methods of validating data and removing suspect data.

The importance of having clean and reliable data in any statistical analysis cannot be stressed enough. Often, in real-world applications the analyst may get mesmerised by the complexity or beauty of the method being applied, while the data itself may be unreliable and lead to results which suggest courses of action without a sound basis. A good statistician/researcher (personal opinion) spends 90% of his/her time on collecting and cleaning data, and developing hypotheses which cover as many external explainable factors as possible, and only 10% on the actual mathematical manipulation of the data and deriving results.

Benefits and Importance of Data Cleaning

Data cleansing is the process of identifying inaccurate or corrupt data in a database. The process is mainly used in databases where incorrect, incomplete, inaccurate or irrelevant parts of the data are identified and then altered, replaced or deleted. Business enterprises depend heavily on data, whether it is the accuracy of customers’ addresses or ensuring that accurate invoices are emailed or posted to the recipients. To ensure that customer data is used in the most productive and meaningful manner, one that can increase the fundamental value of the brand, business enterprises must give importance to data quality.

Managing the data and ensuring that it is clean can provide substantial growth to the business. Business enterprises can face many problems, such as the high cost of processing errors, manual troubleshooting, incorrect invoice data and shipments to the wrong address. Customer information is forever changing due to relocation or other factors, and the updated information must be reflected in the database. Business enterprises can achieve a wide range of benefits by cleansing data, which can lead to lower operational costs and higher profits.

Here are the benefits of data cleaning:

1. Improves the Efficiency of Customer Acquisition Activities:

Business enterprises can significantly boost their customer acquisition efforts by cleaning their data, as a more efficient prospects list with accurate data can be created. Throughout the marketing process, business enterprises must ensure that the data is clean, up-to-date and accurate by regularly following data quality routines. Multi-channel customer data can also be managed seamlessly, which gives the enterprise an opportunity to carry out successful marketing campaigns in the future, as it will know how to reach its target audience effectively.

2. Improves Decision Making Process:

The keystone of effective decision making in a business enterprise is customer data. Precise information and data quality are essential to decision making. Data cleansing can support better analytics as well as all-round business intelligence, which can facilitate better decision making and execution. In the end, having accurate data can help business enterprises make better decisions, which will contribute to the success of the business in the long run.

3. Streamlines Business Practices:

Data cleaning along with the right analytics can also help the enterprise to identify an opportunity to launch a new product or service which consumers might like, or it can highlight various marketing avenues that the enterprise can try. For example, if a marketing campaign is unsuccessful, the business enterprise can look at other marketing channels that have the best customer response data and implement them.

4. Increases Productivity:

Having a clean and properly maintained database can help business enterprises ensure that employees are making the best use of their work hours. By helping staff to work with clean records, it can also prevent them from contacting customers with outdated information or creating invalid vendor files in the system, thereby maximizing the staff’s efficiency and productivity.

Q.2 What is measure of difference? Explain different types of tests in detail with examples. How are these tests used in hypothesis testing?

In statistics, deviation is a measure of the difference between the observed value of a variable and some other value, often but not necessarily that variable’s mean.

For quantitative variables you can measure this deviation in several ways, including:

• Range, which is the difference between the largest and smallest values of the dataset.

• Standard deviation, defined as the square root of the average of the squared deviations of the values from their average.

• Coefficient of variation, defined as the ratio of the standard deviation to the mean.

In predictive models such as a linear regression, deviation is defined as the difference between the measured value and the predicted one.

Types of Tests

T-Test

A t-test is a useful statistical technique used for comparing the mean values of two datasets obtained from two groups. The comparison tells us whether these datasets are different from each other. It further tells us how significant the differences are and whether these differences could have happened by chance. The statistical significance of a t-test indicates whether or not the difference between the means of two groups most likely reflects a real difference in the population from which the groups are selected.

t-tests are used when there are two groups (male and female) or two sets of data (before and after), and the researcher wishes to compare the mean score on some continuous variable.
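As a rough illustration of the two-group comparison, the sketch below computes an independent-samples t statistic by hand. It assumes equal population variances (the pooled-variance form); the scores and the function name are hypothetical.

```python
import math
import statistics

def pooled_t_statistic(group_a, group_b):
    """Independent-samples t statistic and degrees of freedom, assuming equal variances."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error
    return t, df

# Hypothetical test scores for two groups of five students each.
group_a = [78, 82, 85, 90, 75]
group_b = [70, 72, 68, 75, 65]

t_value, df = pooled_t_statistic(group_a, group_b)
print(f"t = {t_value:.3f}, df = {df}")
```

With 8 degrees of freedom the two-tailed critical value at the 0.05 level is 2.306, so a t statistic larger than that in absolute value would lead to rejecting the null hypothesis of equal means.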

Analysis of Variance (ANOVA)

The t-tests have one very serious limitation: they are restricted to tests of the significance of the difference between only two groups. There are many times when we would like to see if there are significant differences among three, four, or even more groups. For example, we may want to investigate which of three teaching methods is best for teaching ninth class algebra. In such a case, we cannot use the t-test because more than two groups are involved. To deal with such cases, one of the most useful techniques in statistics is analysis of variance (abbreviated as ANOVA). This technique was developed by the British statistician Ronald A. Fisher (Dietz & Kalof, 2009; Bartz, 1981).

Analysis of Variance (ANOVA) is a hypothesis testing procedure that is used to evaluate mean differences between two or more treatments (or populations). Like all other inferential procedures, ANOVA uses sample data as a basis for drawing general conclusions about populations. Sometimes it may appear that ANOVA and the t-test are two different ways of doing exactly the same thing: testing for mean differences. In some cases this is true: both tests use sample data to test hypotheses about population means. However, ANOVA has a major advantage over the t-test. t-tests are used when we have to compare only two groups (one independent variable with two levels and one dependent variable). ANOVA, on the other hand, can compare two or more groups or treatments at once. Suppose we want to study the effects of three different models of teaching on the achievement of students. In this case we have three different samples treated using three different treatments, so ANOVA is the suitable technique to evaluate the difference.
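The three-teaching-methods situation can be sketched as a one-way ANOVA computed by hand. The scores below are invented for illustration, and the function name is hypothetical; the calculation follows the standard between-groups and within-groups decomposition.

```python
import statistics

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_scores = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_scores)
    # Between-groups sum of squares: group size times squared distance of each group mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: squared distance of each score from its own group mean.
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

# Hypothetical achievement scores under three teaching methods.
method_1 = [4, 5, 6]
method_2 = [6, 7, 8]
method_3 = [8, 9, 10]

f_value, df_b, df_w = one_way_anova_f([method_1, method_2, method_3])
print(f"F({df_b}, {df_w}) = {f_value:.2f}")
```

The F ratio is compared with the critical value of the F distribution for the same degrees of freedom (about 5.14 for 2 and 6 degrees of freedom at the 0.05 level); a larger F suggests that at least one group mean differs from the others.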

Chi Square Test

The chi square test is a non-parametric test used to assess the statistical significance of the observations under study.

The task of the chi square test is to test the statistical significance of the observed relationship with respect to the expected relationship. The chi square statistic is used by the researcher for determining whether or not a relationship exists.

In the chi square test, the null hypothesis is that there is no association between the two variables observed in the study. The test works with the cell frequencies that would be expected if there were no association between the variables, and compares these expected frequencies with the actual observed frequencies. The expected frequency for a cell is calculated as the product of its row total and its column total, divided by the total size of the sample.

The chi square statistic is calculated by squaring the deviation between the observed and the expected frequency in each cell, dividing by the expected frequency, and summing the results over all cells.
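The expected-frequency and statistic calculations described above can be sketched for a small cross tabulation. The 2×2 counts and the function name are hypothetical.

```python
def chi_square_independence(observed):
    """Return (chi-square statistic, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count under no association: row total * column total / sample size.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (obs - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df

# Hypothetical counts: rows are two groups, columns are pass / fail.
table = [[20, 30],
         [30, 20]]

chi2, df = chi_square_independence(table)
print(f"chi-square = {chi2:.2f}, df = {df}")
```

For 1 degree of freedom the critical value at the 0.05 level is 3.841, so a statistic above that value would lead to rejecting the null hypothesis of no association.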

The researcher should know that the greater the difference between the observed and expected cell frequencies, the larger the value of the chi square statistic.

In order to determine whether an association between the two variables exists, the computed chi square value is compared with the critical value for the chosen significance level and degrees of freedom. If the computed value is larger than the critical value (equivalently, if the probability of obtaining such a value by chance is smaller than the significance level), the association is considered statistically significant.

There is one more popular test called the chi square test for goodness of fit.

The chi square test for goodness of fit helps the researcher to understand whether or not a sample drawn from a certain population follows a specified distribution. It applies directly to discrete distributions such as the Poisson and binomial; continuous data must first be grouped into classes. This type of chi square test is an alternative to the non-parametric Kolmogorov-Smirnov goodness of fit test.

The null hypothesis in this type of chi square test is that the data drawn from the population follow the specified distribution. The chi square statistic in this test is defined in the same manner as in the test described above. One important point to note is that the expected frequency in each cell should be at least five. The chi square test will not be valid for cells whose expected frequency is less than five.
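A goodness-of-fit calculation can be sketched in the same way. The example assumes a hypothetical die rolled 60 times and tests the counts against a uniform distribution; it also checks the rule that every expected frequency should be at least five.

```python
def chi_square_goodness_of_fit(observed, expected):
    """Return the chi-square statistic; refuse to run if any expected count is below 5."""
    if any(e < 5 for e in expected):
        raise ValueError("every expected frequency should be at least 5")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts for faces 1 to 6 after 60 rolls of a die.
observed = [8, 12, 10, 10, 9, 11]
expected = [sum(observed) / 6] * 6   # a fair die gives 10 per face

chi2_gof = chi_square_goodness_of_fit(observed, expected)
df_gof = len(observed) - 1
print(f"chi-square = {chi2_gof:.2f}, df = {df_gof}")
```

With 5 degrees of freedom the critical value at the 0.05 level is 11.07, so a small statistic such as this one gives no reason to reject the hypothesis that the die is fair.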

There are certain assumptions in the chi square test.

Random sampling of data is assumed in the chi square test.

In the chi square test, a sample of sufficiently large size is assumed. If the chi square test is conducted on a small sample, it will yield inaccurate inferences. By using the chi square test on small samples, the researcher might end up committing a Type II error.

In the chi square test, the observations are always assumed to be independent of each other.

In the chi square test, the observations must have the same underlying distribution.

Q.3 Explain the concept of reliability. Explain types of reliability and methods used to calculate each type.

Reliability is the degree to which an assessment tool produces stable and consistent results. Reliability in statistics and psychometrics is the overall consistency of a measure. A measure is said to have a high reliability if it produces similar results under consistent conditions. “It is the characteristic of a set of test scores that relates to the amount of random error from the measurement process that might be embedded in the scores. Scores that are highly reliable are accurate, reproducible, and consistent from one testing occasion to another. That is, if the testing process were repeated with a group of test takers, essentially the same results would be obtained. Various kinds of reliability coefficients, with values ranging between 0.00 (much error) and 1.00 (no error), are usually used to indicate the amount of error in the scores.” For example, measurements of people’s height and weight are often extremely reliable.

Types of Reliability

Test-retest reliability is a measure of reliability obtained by administering the same test twice over a period of time to a group of individuals. The scores from Time 1 and Time 2 can then be correlated in order to evaluate the test for stability over time.

Example: A test designed to assess student learning in psychology could be given to a group of students twice, with the second administration perhaps coming a week after the first. The obtained correlation coefficient would indicate the stability of the scores.

Parallel forms reliability is a measure of reliability obtained by administering different versions of an assessment tool (both versions must contain items that probe the same construct, skill, knowledge base, etc.) to the same group of individuals. The scores from the two versions can then be correlated in order to evaluate the consistency of results across alternate versions.

Example: If you wanted to evaluate the reliability of a critical thinking assessment, you might create a large set of items that all pertain to critical thinking and then randomly split the questions up into two sets, which would represent the parallel forms.

Inter-rater reliability is a measure of reliability used to assess the degree to which different judges or raters agree in their assessment decisions. Inter-rater reliability is useful because human observers will not necessarily interpret answers the same way; raters may disagree as to how well certain responses or material demonstrate knowledge of the construct or skill being assessed.

Example: Inter-rater reliability might be employed when different judges are evaluating the degree to which art portfolios meet certain standards. Inter-rater reliability is especially useful when judgments can be considered relatively subjective. Thus, the use of this type of reliability would probably be more likely when evaluating artwork as opposed to math problems.

Internal consistency reliability is a measure of reliability used to evaluate the degree to which different test items that probe the same construct produce similar results.

Average inter-item correlation is a subtype of internal consistency reliability. It is obtained by taking all of the items on a test that probe the same construct (e.g., reading comprehension), determining the correlation coefficient for each pair of items, and finally taking the average of all of these correlation coefficients. This final step yields the average inter-item correlation.

Split-half reliability is another subtype of internal consistency reliability. The process of obtaining split-half reliability is begun by “splitting in half” all items of a test that are intended to probe the same area of knowledge (e.g., World War II) in order to form two “sets” of items. The entire test is administered to a group of individuals, the total score for each “set” is computed, and finally the split-half reliability is obtained by determining the correlation between the two total “set” scores.
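The split-half procedure can be sketched as follows. The sketch assumes an odd/even split of the items (one of several possible ways to form the two sets) and uses the Pearson correlation between the two half totals; the item scores and function names are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def split_half_reliability(item_scores):
    """Correlate each person's total on odd-numbered items with their total on even-numbered items."""
    odd_totals = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return pearson_r(odd_totals, even_totals)

# Hypothetical right/wrong (1/0) scores: one row per student, one column per item.
item_scores = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

reliability = split_half_reliability(item_scores)
print(f"split-half reliability = {reliability:.3f}")
```

The same `pearson_r` helper could be applied to Time 1 and Time 2 scores for test-retest reliability, or to scores on two alternate versions for parallel forms reliability.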

Q.4 What is correlation? How does level of measurement help us in selecting the correct type of correlation? Write a comprehensive note on the range of the correlation coefficient and what it explains. Can we predict future correlation by current relationship? If yes, then how?

CORRELATION

Correlation is a statistical technique used to measure and describe the relationship between two variables. These variables are neither manipulated nor controlled; rather, they are simply observed as they naturally exist in the environment. Suppose a researcher is interested in the relationship between the number of children in a family and the IQ of the individual child. He would take a group of students coming from different families. Then he would simply observe or record the number of children in each family and measure the IQ score of each individual student in the same group. He would neither manipulate nor control any variable. Correlation requires two separate scores for each individual (one score from each of two variables). These scores are normally identified as X and Y and can be presented in a table or in a graph.

Correlation is used to test relationships between quantitative variables or categorical variables. In other words, it’s a measure of how things are related. The study of how variables are correlated is called correlation analysis.

Some examples of data that have a high correlation:

* Your caloric intake and your weight.

* Your eye color and your relatives’ eye colors.

* The amount of time you study and your GPA.

Some examples of data that have a low correlation (or none at all):

* Your sexual preference and the type of cereal you eat.

* A dog’s name and the type of dog biscuit they prefer.

* The cost of a car wash and how long it takes to buy a soda inside the station.

Correlations are useful because if you can find out what relationship variables have, you can make predictions about future behavior. Knowing what the future holds is very important in the social sciences like government and healthcare. Businesses also use these statistics for budgets and business plans.

CORRELATION COEFFICIENT

The term “correlation coefficient” was coined by Karl Pearson in 1896. Accordingly, this statistic is over a century old, and is still going strong. It is one of the most used statistics today, second only to the mean. The correlation coefficient’s weaknesses and warnings of misuse are well documented. As a fifteen-year practiced consulting statistician, who also teaches statisticians continuing and professional studies for the Database Marketing/Data Mining Industry, I see too often that the weaknesses and warnings are not heeded. Among the weaknesses/uses, there is one that is rarely mentioned: the correlation coefficient interval [-1, +1] is restricted by the individual distributions of the two variables being correlated. The purpose of this article is: 1) to introduce the effects the distributions of the two individual variables have on the correlation coefficient interval; and 2) thus, to provide a procedure for calculating an adjusted correlation coefficient, whose realized correlation coefficient interval is often shorter than the original one.

Basics of the Correlation Coefficient

The correlation coefficient, denoted by r, is a measure of the strength of the straight-line or linear relationship between two variables. The well-known correlation coefficient is often misused because its linearity assumption is not tested. The correlation coefficient can, by definition (i.e., theoretically), assume any value in the interval between +1 and -1, including the end values plus/minus 1.

The following points are the accepted guidelines for interpreting the correlation coefficient:

0 indicates no linear relationship.

+1 indicates a perfect positive linear relationship: as one variable increases in its values, the other variable also increases in its values via an exact linear rule.

-1 indicates a perfect negative linear relationship: as one variable increases in its values, the other variable decreases in its values via an exact linear rule.
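The three reference points in these guidelines can be checked numerically. The sketch below computes Pearson's r from its definition for three small, made-up data sets; the function name and the data are illustrative only.

```python
import math

def correlation_coefficient(x, y):
    """Pearson's r: covariance of x and y divided by the product of their spreads."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
y_positive = [2, 4, 6, 8, 10]   # rises with x by an exact linear rule
y_negative = [10, 8, 6, 4, 2]   # falls as x rises by an exact linear rule
y_unrelated = [2, 1, 3, 1, 2]   # no linear trend with x

r_pos = correlation_coefficient(x, y_positive)
r_neg = correlation_coefficient(x, y_negative)
r_none = correlation_coefficient(x, y_unrelated)
print(round(r_pos, 3), round(r_neg, 3), round(r_none, 3))
```

Values between these reference points indicate weaker linear relationships; a value near 0 does not rule out a non-linear relationship.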

Prediction

If two variables are known to be related in some systematic way, it is possible to use one variable to make predictions about the other. For example, when a student seeks admission to a college, he is required to submit a great deal of personal information, including his scores in the SSC annual/supplementary examination. The college officials want this information so that they can predict that student’s chance of success in college.
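One common way to turn a correlation into a prediction is a least-squares regression line. The sketch below assumes a linear relationship and uses invented SSC percentages and college GPAs; the variable and function names are hypothetical and the source does not prescribe this particular method.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical records of past students: SSC percentage and later college GPA.
ssc_percent = [60, 70, 80, 90]
college_gpa = [2.0, 2.5, 3.0, 3.5]

slope, intercept = fit_line(ssc_percent, college_gpa)

# Predicted GPA for a new applicant with 85% in SSC.
predicted_gpa = intercept + slope * 85
print(f"predicted GPA = {predicted_gpa:.2f}")
```

The stronger the correlation between the two variables, the closer such predictions tend to be to the actual outcomes; with a weak correlation the line has little predictive value.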

Q.5 Explain the following terms with examples.

a) Degree of Freedom

Degrees of Freedom

The concept of degrees of freedom is central to the principle of estimating statistics of populations from samples of them. “Degrees of freedom” is commonly abbreviated to df.

Think of df as a mathematical restriction that needs to be put in place when estimating one statistic from an estimate of another.

Let us take an example of data that have been drawn at random from a normal distribution. Normal distributions need only two parameters (mean and standard deviation) for their definition; e.g. the standard normal distribution has a mean of 0 and standard deviation (sd) of 1. The population values of mean and sd are referred to as mu and sigma respectively, and the sample estimates are x-bar and s.

In order to estimate sigma, we must first have estimated mu. Thus, mu is replaced by x-bar in the formula for sigma. In other words, we work with the deviations from mu estimated by the deviations from x-bar. At this point, we need to apply the restriction that the deviations must sum to zero. Thus, degrees of freedom are n-1 in the equation for s below:

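Standard deviation in a population is:

$$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{n}}$$

[x is a value from the population, μ is the mean of all x, n is the number of x in the population, Σ is the summation]

The estimate of population standard deviation calculated from a random sample is:

$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$$

[x_i is the ith observation from a sample of the population, x-bar is the sample mean, n is the sample size, Σ is the summation]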

When this principle of restriction is applied to regression and analysis of variance, the general result is that you lose one degree of freedom for each parameter estimated prior to estimating the (residual) standard deviation.

Another way of thinking about the restriction principle behind degrees of freedom is to imagine contingencies. For example, imagine you have four numbers (a, b, c and d) that must add up to a total of m; you are free to choose the first three numbers at random, but the fourth must be chosen so that it makes the total equal to m. Thus your degrees of freedom are three.
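Both ideas, the zero-sum restriction on deviations and the division by n − 1, can be checked with a few lines of code. The data are made up; `statistics.pstdev` and `statistics.stdev` are the standard library's population and sample standard deviation functions.

```python
import math
import statistics

sample = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical observations
n = len(sample)
x_bar = statistics.mean(sample)

# Deviations from the sample mean always sum to zero, so only n - 1 of them are free.
deviations = [x - x_bar for x in sample]
sum_of_squares = sum(d ** 2 for d in deviations)

sd_divide_by_n = math.sqrt(sum_of_squares / n)               # population formula
sd_divide_by_n_minus_1 = math.sqrt(sum_of_squares / (n - 1)) # sample estimate, df = n - 1

print("sum of deviations:", sum(deviations))
print("dividing by n:    ", round(sd_divide_by_n, 3))
print("dividing by n - 1:", round(sd_divide_by_n_minus_1, 3))
```

Dividing by n − 1 gives a slightly larger value, which compensates for the fact that deviations measured from x-bar are, on average, smaller than deviations measured from the unknown mu.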

b) Spread of Scores

Measures of spread describe how similar or varied the set of observed values are for a particular variable (data item). Measures of spread include the range, quartiles and the interquartile range, variance and standard deviation.
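The measures listed above can be computed with Python's standard `statistics` module. The scores are invented for illustration, and the quartiles use the module's default ("exclusive") method; other quartile conventions can give slightly different values.

```python
import statistics

scores = [12, 15, 11, 18, 20, 14, 16]   # hypothetical test scores

score_range = max(scores) - min(scores)
q1, q2, q3 = statistics.quantiles(scores, n=4)   # quartiles; q2 is the median
iqr = q3 - q1
variance = statistics.variance(scores)           # sample variance (divides by n - 1)
std_dev = statistics.stdev(scores)               # sample standard deviation

print("range:", score_range)
print("quartiles:", q1, q2, q3)
print("interquartile range:", iqr)
print("variance:", round(variance, 2))
print("standard deviation:", round(std_dev, 2))
```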

When can we measure spread?

The spread of the values can be measured for quantitative data, as the variables are numeric and can be arranged into a logical order with a low end value and a high end value.

Why do we measure spread?

Summarising the dataset can help us understand the data, especially when the dataset is large. The measures of central tendency (the mode, median, and mean) summarise the data into a single value that is typical or representative of all the values in the dataset, but this is only part of the ‘picture’ that summarises a dataset. Measures of spread summarise the data in a way that shows how scattered the values are and how much they differ from the mean value.

c) Sample

In statistics, you’ll be working with samples. A sample is just a part of a population. For example, if you want to find out how much the average American earns, you aren’t going to want to survey everyone in the population (over 300 million people), so you would choose a small number of people in the population. For example, you might select 10,000 people.

Finding a Sample

Technically, you can’t just choose 10,000 people. For the sample to be statistically valid (i.e. one that you can use in statistics), its size must be determined using a statistical method. Ten thousand people might not be the optimal number for valid survey results: you may need more, or fewer. There are many ways to find sample sizes, including using data from prior experiments or using a sample size calculator. How you find a sample size can be quite complex, depending on what you want to do with your data.

Methods

If you’ve decided to assemble your sample from scratch (for example, you aren’t using prior data), then you need to choose a sampling method. Which sampling method you use depends on what resources and information you have available. For example, the military draft worked by drawing random birth dates, a method called simple random sampling. In order for that to work, the government needed a list of every potential draftee’s name and date of birth. The draft could also have used systematic sampling, drawing the nth name from a list (for example, every 100th name). For that to have worked, all the names must first have been compiled on a list.
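The two sampling methods mentioned above can be sketched in a few lines. The population list, the seed, and the helper name are hypothetical; `random.Random.sample` draws without replacement.

```python
import random

# Hypothetical sampling frame: a complete list of 1,000 names.
population = [f"person_{i:04d}" for i in range(1, 1001)]

rng = random.Random(42)   # fixed seed so the illustration is repeatable

# Simple random sampling: every member has the same chance of selection.
simple_random_sample = rng.sample(population, 10)

def systematic_sample(items, step, start=0):
    """Take every `step`-th item from the list, beginning at index `start`."""
    return items[start::step]

# Systematic sampling: a random starting point, then every 100th name.
start = rng.randrange(100)
systematic = systematic_sample(population, 100, start)

print(simple_random_sample)
print(systematic)
```

Both methods require the full list of the population in advance, which is the practical constraint the passage points out.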

d) Confidence Interval

What are confidence intervals?

How do we form a confidence interval?

The purpose of taking a random sample from a lot or population and computing a statistic, such as the mean from the data, is to approximate the mean of the population. How well the sample statistic estimates the underlying population value is always an issue. A confidence interval addresses this issue because it provides a range of values which is likely to contain the population parameter of interest.

Confidence levels

Confidence intervals are constructed at a confidence level, such as 95%, selected by the user. What does this mean? It means that if the same population is sampled on numerous occasions and interval estimates are made on each occasion, the resulting intervals would bracket the true population parameter in approximately 95% of the cases. A confidence stated at a $1-\alpha$ level can be thought of as the inverse of a significance level, $\alpha$.

One and two-sided confidence intervals

In the same way that statistical tests can be one or two-sided, confidence intervals can be one or two-sided. A two-sided confidence interval brackets the population parameter from above and below. A one-sided confidence interval brackets the population parameter either from above or below and furnishes an upper or lower bound to its magnitude.

Example of a two-sided confidence interval

For example, a $100(1-\alpha)\%$ confidence interval for the mean of a normal population is

$$\bar{Y} \pm z_{1-\alpha/2} \frac{\sigma}{\sqrt{N}}$$

where $\bar{Y}$ is the sample mean, $z_{1-\alpha/2}$ is the $1-\alpha/2$ critical value of the standard normal distribution which is found in the table of the standard normal distribution, $\sigma$ is the known population standard deviation, and $N$ is the sample size.
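The two-sided interval for a mean with known population standard deviation can be computed directly. The sample mean, standard deviation and sample size below are hypothetical; `statistics.NormalDist().inv_cdf` supplies the critical value that would otherwise be read from the standard normal table.

```python
import math
from statistics import NormalDist

def mean_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Two-sided confidence interval for a normal mean with known sigma."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # critical value z_{1 - alpha/2}
    margin = z * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Hypothetical values: sample mean 50, known sigma 10, sample size 100.
lower, upper = mean_confidence_interval(50, 10, 100)
print(f"95% confidence interval: ({lower:.2f}, {upper:.2f})")
```

Raising the confidence level widens the interval, while a larger sample size narrows it.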

e) Z Score

z-score

A z-score (also known as a standard score) indicates how many standard deviations an element is from the mean. A z-score can be calculated from the following formula.

$$z = \frac{X - \mu}{\sigma}$$

where z is the z-score, X is the value of the element, μ is the population mean, and σ is the standard deviation.

Here is how to interpret z-scores.

A z-score less than 0 represents an element less than the mean.

A z-score greater than 0 represents an element greater than the mean.

A z-score equal to 0 represents an element equal to the mean.

A z-score equal to 1 represents an element that is 1 standard deviation greater than the mean; a z-score equal to 2, 2 standard deviations greater than the mean; etc.

A z-score equal to -1 represents an element that is 1 standard deviation less than the mean; a z-score equal to -2, 2 standard deviations less than the mean; etc.

If the number of elements in the set is large and the values are approximately normally distributed, about 68% of the elements have a z-score between -1 and 1; about 95% have a z-score between -2 and 2; and about 99.7% have a z-score between -3 and 3.

Example

Sample question: You take the SAT and score 1100. The mean score for the SAT is 1026 and the standard deviation is 209. How well did you score on the test compared to the average test taker?

Step 1: Write your X-value into the z-score equation. For this sample question the X-value is your SAT score, 1100.

$$z = \frac{1100 - \mu}{\sigma}$$

Step 2: Put the mean, μ, into the z-score equation.

$$z = \frac{1100 - 1026}{\sigma}$$

Step 3: Write the standard deviation, σ, into the z-score equation.

$$z = \frac{1100 - 1026}{209}$$

Step 4: Calculate the answer using a calculator:

(1100 − 1026) / 209 = 0.354. This means that your score was 0.354 standard deviations above the mean.

Step 5: (Optional) Look up your z-value in the z-table to see what percentage of test-takers scored below you. Rounding the z-score to 0.35, the table gives 0.1368 as the area between the mean and z; adding the 0.5000 that lies below the mean gives 0.1368 + 0.5000 = 0.6368, or 63.68%.
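The worked example can be reproduced in code. The sketch uses `statistics.NormalDist` in place of the printed z-table and assumes SAT scores are approximately normally distributed; the function name is illustrative.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

z = z_score(1100, 1026, 209)

# Proportion of a normal distribution that falls below this z-score.
proportion_below = NormalDist().cdf(z)

print(f"z = {z:.3f}")
print(f"proportion scoring below = {proportion_below:.4f}")
```

The computed proportion differs slightly from the table value of 0.6368 because the table lookup rounds z to two decimal places (0.35) before reading off the area.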
