
        CAP 4611 coursework help; C/C++ and Java programming help

        Date: 2025-04-28  Source: Hefei Net (hfw.cc)  Author: hfw.cc



        Final Exam
        Instructor: Amrit Singh Bedi
        Instructions
        This exam is worth a total of 100 points. Please answer all questions clearly
        and concisely. Show all your work and justify your answers.
        • For Questions 1 and 2, please submit a PDF version of your solution
        via webcourses. You can either write it in LaTeX or do it on paper and
        submit a scanned version. If you do it on paper and scan it,
        you are responsible for ensuring it is readable and properly scanned.
        Work that is not clearly written or scanned will receive zero marks.
        • The total time to complete the exam is 24 hours, and it is due at 4:00
        pm EST on Friday, April 25th, 2025. This is a take-home exam. Please
        do not use AI tools such as ChatGPT to complete the exam. Any detected
        use will receive zero marks (believe me, we would know if you use it).
        Question 1 50 marks
        Context: In supervised learning, understanding the bias-variance tradeoff
        is crucial for developing models that generalize well to unseen data.
        Problem 1 10 marks
        Define the terms bias, variance, and irreducible error in the context of
        supervised learning. Explain how each contributes to the total expected error
        of a model.
        Problem 2 20 marks
        Derive the bias-variance decomposition of the expected squared error for a
        regression problem. That is, show that:
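        $$\mathbb{E}_{D,\varepsilon}\big[(y - \hat{f}(x))^2\big] = \big(\mathrm{Bias}[\hat{f}(x)]\big)^2 + \mathrm{Var}[\hat{f}(x)] + \sigma^2$$
        where $\hat{f}(x)$ is the prediction of the model trained on dataset $D$,
        $y = f(x) + \varepsilon$, and $\sigma^2$ is the variance of the noise $\varepsilon$.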
        Hint: You can start by taking $y = f(x) + \varepsilon$, where
        $\mathbb{E}[\varepsilon] = 0$ and $\mathrm{Var}[\varepsilon] = \sigma^2$.
        Let $\hat{f}(x)$ be a function learned from the training set $D$. Then
        proceed towards the derivation.
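The identity can also be checked numerically before it is derived. The sketch below is a Monte Carlo illustration under assumptions that are not part of the exam: a hypothetical true function `true_f(x) = 2x`, Gaussian noise with `sigma = 0.5`, four fixed training inputs, and a deliberately shrunken slope estimator (`fit_shrunk_slope`, with penalty `lam`) chosen so that the bias is non-zero. It compares the simulated expected squared error at one test point with the sum of the three terms.

```python
import random
import statistics


def true_f(x):
    # Hypothetical ground-truth function, chosen only for illustration.
    return 2.0 * x


def fit_shrunk_slope(xs, ys, lam):
    # Slope of a line through the origin, shrunk towards zero by lam.
    # The shrinkage makes the estimator biased on purpose.
    num = sum(x * y for x, y in zip(xs, ys))
    den = sum(x * x for x in xs) + lam
    return num / den


def decompose(x0=1.0, sigma=0.5, lam=2.5, trials=20000, seed=0):
    rng = random.Random(seed)
    xs = [0.5, 1.0, 1.5, 2.0]  # fixed training inputs (assumed)
    preds, sq_errors = [], []
    for _ in range(trials):
        # Draw a fresh training set D and fit the model on it.
        ys = [true_f(x) + rng.gauss(0.0, sigma) for x in xs]
        pred = fit_shrunk_slope(xs, ys, lam) * x0
        # Draw an independent noisy test label at x0.
        y0 = true_f(x0) + rng.gauss(0.0, sigma)
        preds.append(pred)
        sq_errors.append((y0 - pred) ** 2)
    bias_sq = (statistics.fmean(preds) - true_f(x0)) ** 2
    variance = statistics.pvariance(preds)
    return statistics.fmean(sq_errors), bias_sq, variance, sigma ** 2


mse, bias_sq, variance, noise = decompose()
print(f"simulated expected squared error: {mse:.4f}")
print(f"bias^2 + variance + sigma^2:      {bias_sq + variance + noise:.4f}")
```

The two printed numbers should agree up to Monte Carlo error. The simulation only illustrates the identity at one test point and does not replace the algebraic derivation the problem asks for.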
        Problem 3 10 marks
        Consider two models trained on the same dataset:
        • Model A: A simple linear regression model.
        • Model B: A 10th-degree polynomial regression model.
        Discuss, in terms of bias and variance, the expected performance of each
        model on training data and unseen test data. Which model is more likely
        to overfit, and why?
        Problem 4 10 marks
        Explain how increasing the size of the training dataset affects the bias and
        variance of a model. Provide reasoning for your explanation.
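One way to build intuition for this question is to simulate it. The sketch below makes assumptions that are not in the exam: a hypothetical target `true_f(x) = x^2` on inputs drawn uniformly from [0, 1), Gaussian noise with `sigma = 0.3`, and a deliberately misspecified model that always predicts the mean of the training labels. It estimates the bias and variance of the prediction at `x0 = 1.0` for three training set sizes.

```python
import random
import statistics


def true_f(x):
    # Hypothetical target function, used only for this illustration.
    return x * x


def constant_model_prediction(n, sigma, rng):
    # Train on n noisy samples; the "model" is just the mean training label.
    ys = [true_f(rng.random()) + rng.gauss(0.0, sigma) for _ in range(n)]
    return statistics.fmean(ys)


def bias_and_variance(n, x0=1.0, sigma=0.3, trials=400, seed=0):
    rng = random.Random(seed)
    # Each trial draws a fresh training set of size n.
    preds = [constant_model_prediction(n, sigma, rng) for _ in range(trials)]
    bias = statistics.fmean(preds) - true_f(x0)
    return bias, statistics.pvariance(preds)


results = {n: bias_and_variance(n) for n in (5, 50, 500)}
for n, (bias, variance) in results.items():
    print(f"n={n:4d}  bias={bias:+.3f}  variance={variance:.5f}")
```

In this particular setup the estimated variance shrinks as `n` grows while the bias stays roughly constant, because the constant model cannot represent the target no matter how much data it sees. Other model classes and estimators may behave differently, so this is a single illustration and not a general proof.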
        Question 2: Using Transformer Attention 50 marks
        Context. Consider a simplified Transformer with a vocabulary of six tokens:
        • I (ID 0): embedding (1.0, 0.0)
        • like (ID 1): embedding (0.0, 1.0)
        • to (ID 2): embedding (1.0, 1.0)
        • eat (ID 3): embedding (0.5, 0.5)
        • apples (ID 4): embedding (0.6, 0.4)
        • bananas (ID 5): embedding (0.4, 0.6)
        All three projection matrices are the 2 × 2 identity:
        $W_Q = W_K = W_V = I_2$.
        When predicting the next token, the model uses masked self-attention: the
        query comes from the last position, while keys and values come from all
        tokens up to and including that position. (Note: show step-by-step
        calculations for all questions below.)
        (a) (10 marks) For the input sequence [I, like, to] (IDs [0, 1, 2]),
        compute the query, key and value vectors for each token.
        (b) (15 marks) Let $q$ be the query of the last token and $K$, $V$ the keys
        and values of all three tokens.
        • Compute the row vector of raw attention scores $qK^\top$, where $q$ is
        the query of the last token and $K$ is the 3×2 matrix of keys.
        • Scale the scores by dividing by $\sqrt{d_k}$ (with $d_k = 2$) and apply
        softmax to obtain attention weights.
        • Compute the context vector as the weighted sum of the values.
        (c) (15 marks) Given the context vector $c \in \mathbb{R}^2$ from part (b),
        compute the unnormalized score for each vocabulary embedding via
        $c \cdot \mathrm{embed}(w)$, i.e. the dot product.
        • Apply softmax over these six scores to get a probability distribution.
        • Which token has the highest probability? [Note: Because the six
        embeddings are synthetic and not trained on real text, the token
        that receives the highest probability may look ungrammatical in
        normal English; this is an artifact of the toy setup.]
        (d) (10 marks) Explain why the model selects the token you found in
        (c). In your answer, discuss:
        • How the attention weights led to that choice.
        • Why keys/values may include the current token but never
        future tokens.
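The pipeline described in parts (a) to (c) can be sketched in a few lines of code. The example below is a minimal illustration, assuming identity projection matrices as in the question. It uses a hypothetical three-token vocabulary (`a`, `b`, `c`) with made-up embeddings, not the exam's six tokens, and the helper names `attend_last` and `next_token_probs` are invented for this sketch.

```python
import math


def softmax(scores):
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attend_last(embeddings):
    # With identity projections, queries, keys and values all equal the
    # embeddings. The query is the last position; keys and values cover
    # every position in the prefix passed in, so no future token is seen.
    d_k = len(embeddings[0])
    q = embeddings[-1]
    scores = [dot(q, k) / math.sqrt(d_k) for k in embeddings]
    weights = softmax(scores)
    context = [sum(w * v[i] for w, v in zip(weights, embeddings))
               for i in range(d_k)]
    return weights, context


def next_token_probs(context, vocab):
    # Score every vocabulary embedding by its dot product with the context.
    names = list(vocab)
    probs = softmax([dot(context, vocab[name]) for name in names])
    return dict(zip(names, probs))


# Hypothetical toy vocabulary; these are not the exam's embeddings.
vocab = {"a": [1.0, 0.0], "b": [0.0, 2.0], "c": [2.0, 2.0]}
sequence = [vocab["a"], vocab["b"]]

weights, context = attend_last(sequence)
probs = next_token_probs(context, vocab)
print("attention weights:", weights)
print("context vector:", context)
print("most likely next token:", max(probs, key=probs.get))
```

Passing only the prefix to `attend_last` plays the role of the causal mask here: the last position attends to itself and to earlier tokens, and nothing later exists to attend to. A full implementation would instead mask out future positions inside a batched score matrix.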